Wanproxy on Fedora/Redhat

Juli Mallett juli at clockworksquid.com
Tue Dec 8 17:07:18 PST 2009


Hi William,

At some point I intend to implement standard compression on top of the
existing algorithm (or at least to use traditional compression for data that
has not been seen before).  Note that WANProxy does deduplication, not
compression.  I'd be surprised if LZF does as well on a real-world data set
once repeated transfers over the WAN are taken into account.

For a toy example, consider two people in a remote office reviewing a
document that is stored on a central server.  Let's assume it's a 22M PDF
(I'm using the German-language documentation for the Propellerhead Reason
software as an example because it's the biggest PDF I have on my hard drive).
With no WAN compression, each download sends 22M of traffic over your link.

Now assume that you have a persistent 'gzip -9' of your traffic going on.
The first time somebody downloads it, it's going to take 14M of traffic.
The second time somebody downloads it, it'll take about that same amount,
since gzip's 32K window is far too small to remember a 22M file.

With the XCodec (as used in tack and wanproxy today), things are a little
different because the dictionary used for encoding is persistent.  The
first time somebody downloads it, it's going to take ~15.5M of traffic.
The second time, though, mostly just references to the previously sent
chunks need to be sent: less than 2M of traffic.
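
To make the comparison concrete, here is a small sketch that totals the
traffic for repeated downloads using only the figures quoted above.  The
function name and the choice to treat "less than 2M" as exactly 2M (an upper
bound) are assumptions for illustration, not measurements.

```python
# Per-download traffic from the example above, in megabytes.
UNCOMPRESSED = 22.0
GZIP = 14.0            # roughly the same on every download
XCODEC_FIRST = 15.5
XCODEC_REPEAT = 2.0    # assumption: "less than 2M" used as an upper bound


def total_traffic(downloads: int) -> dict:
    """Total megabytes sent over the link after `downloads` downloads."""
    if downloads < 1:
        raise ValueError("downloads must be at least 1")
    return {
        "none": UNCOMPRESSED * downloads,
        "gzip": GZIP * downloads,
        "xcodec": XCODEC_FIRST + XCODEC_REPEAT * (downloads - 1),
    }


for n in (1, 2, 5):
    print(n, total_traffic(n))
```

With these numbers the persistent dictionary costs slightly more than gzip on
the first download and pulls ahead from the second download onward.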

The simplest way to test this is to take your data set (a tar file of the
whole thing if you want some idea of how it'll do for all of your data, or
you can go a file-at-a-time) and do something like:

tack -c dataset > dataset.t
tack -c dataset dataset > dataset.t-2

Then look at the difference in size.  That difference is the amount of data
that a second (or third or fourth or ...) download is going to take.  Note
that each file passed on the command line to tack is like a new connection:
it uses the same persistent dictionary but its own connection-specific
state.  If you want to see what downloading the same file twice over the
same connection looks like, do something more like 'cat dataset dataset |
tack -c > dataset.t-3'.  It should be similar, though in some cases (very
low-entropy files) it might be better.
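
If you'd rather not do the subtraction by hand, something like the following
sketch would report it.  It assumes the two output files from the tack
commands above already exist under those names; the helper names are made up
for this example and it does not run tack itself.

```python
import os


def repeat_download_cost(first_path: str, second_path: str) -> int:
    """Bytes added by encoding the data set a second time."""
    return os.path.getsize(second_path) - os.path.getsize(first_path)


def report(first_path: str = "dataset.t",
           second_path: str = "dataset.t-2") -> str:
    """Describe the first-download and repeat-download sizes."""
    first = os.path.getsize(first_path)
    extra = repeat_download_cost(first_path, second_path)
    line = f"first download: {first} bytes, repeat download: {extra} bytes"
    if first > 0:
        line += f" ({100.0 * extra / first:.1f}% of the first)"
    return line


# Only report when both files from the tack commands are present.
if os.path.exists("dataset.t") and os.path.exists("dataset.t-2"):
    print(report())
```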

Note that the XCodec is pretty smart and will exhaustively look for chunks
of data it has seen before when encoding, so you can change things in your
file or add random data to the middle and still see substantial
deduplication.
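
To illustrate why an insertion in the middle doesn't ruin things, here is a
toy sketch of deduplication with a persistent dictionary.  This is not the
XCodec's actual algorithm or wire format; the chunk size, the cost of a
reference, and the class itself are all assumptions chosen to keep the
example short.

```python
import random

CHUNK = 64     # hypothetical chunk size in bytes
REF_COST = 8   # hypothetical size of one back-reference on the wire


class ToyDedup:
    """Toy deduplicating encoder whose dictionary outlives each call."""

    def __init__(self):
        self.seen = set()  # chunks remembered from all earlier inputs

    def encode_cost(self, data: bytes) -> int:
        """Return the estimated number of bytes needed to send `data`."""
        cost = 0
        i = 0
        # Try a lookup at every byte offset, not only at chunk boundaries,
        # so known chunks are still found after an insertion shifts them.
        while i + CHUNK <= len(data):
            if data[i:i + CHUNK] in self.seen:
                cost += REF_COST   # send a short reference
                i += CHUNK
            else:
                cost += 1          # send one literal byte
                i += 1
        cost += len(data) - i      # trailing bytes shorter than a chunk
        # Remember this input's aligned chunks for later calls.
        for j in range(0, len(data) - CHUNK + 1, CHUNK):
            self.seen.add(data[j:j + CHUNK])
        return cost


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
junk = bytes(rng.getrandbits(8) for _ in range(100))
edited = original[:1000] + junk + original[1000:]

codec = ToyDedup()
print("first send: ", codec.encode_cost(original))
print("edited send:", codec.encode_cost(edited))
```

The first send costs the full size of the input because nothing is known yet;
the edited copy costs only the literals around the inserted bytes plus one
short reference per chunk that was already in the dictionary.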

Thanks,
Juli.

On Tue, Dec 8, 2009 at 16:48, Tan, William <wtan at eci.com> wrote:

> It’s no big deal, but thank you to everyone for the responses.  We have
> a VMware farm, so it only took me about 15 minutes to build an Ubuntu system
> and compile Wanproxy.
>
>
>
> I was interested in testing the compression algorithm against our sample
> data set, so tack was all I really ended up needing.   As it turns out in
> our case, LZF compression seems to be more effective than the Wanproxy
> compression algorithm.   I’m sure that the best algorithm would vary
> depending on the use case.
>
>
>
> Perhaps as part of the dev roadmap, it might make sense to implement LZF as
> an option?
>
> *From:* Juli Mallett [mailto:juli at clockworksquid.com]
> *Sent:* Tuesday, December 08, 2009 6:55 PM
> *To:* Tan, William
> *Cc:* wanproxy at lists.wanproxy.org
> *Subject:* Re: Wanproxy on Fedora/Redhat
>
>
>
> Sorry for the trouble.  I've started work on a GNU Make build system but
> have run into a problem that probably has a trivial solution in GNU Make,
> just a different one than in BSD Make; because I'm used to the latter, I
> haven't looked up how to do it properly in the former: defining a target
> only if no target by that name has already been defined.  I think I might
> also have had some trouble figuring out what to do with something like
> '.if make(foo)' in BSD Make, but I don't remember if WANProxy uses any of
> those.
>
>
>
> Anyway, here's a trivial GNU-style Makefile that will probably work:
>
>
>
> %%%
>
> # Continuation lines of SRCS end in a backslash so make reads one list.
> SRCS= proxy_client.cc proxy_listener.cc proxy_socks_connection.cc \
>     proxy_socks_listener.cc wanproxy.cc wanproxy_config.cc \
>     wanproxy_config_class_codec.cc wanproxy_config_class_interface.cc \
>     wanproxy_config_class_peer.cc wanproxy_config_class_proxy.cc \
>     wanproxy_config_class_proxy_socks.cc wanproxy_config_type_codec.cc \
>     ../../common/buffer.cc ../../common/log.cc ../../common/timer.cc \
>     ../../config/config.cc ../../config/config_class.cc \
>     ../../config/config_class_log_mask.cc ../../config/config_object.cc \
>     ../../config/config_type_int.cc ../../config/config_type_log_level.cc \
>     ../../config/config_type_pointer.cc ../../config/config_type_string.cc \
>     ../../config/config_class_address.cc \
>     ../../config/config_type_address_family.cc ../../event/event_poll.cc \
>     ../../event/event_system.cc ../../event/timeout.cc \
>     ../../event/event_poll_poll.cc ../../io/file_descriptor.cc \
>     ../../io/io_system.cc ../../io/pipe_link.cc ../../io/pipe_null.cc \
>     ../../io/pipe_sink.cc ../../io/pipe_pair_echo.cc ../../io/pipe_simple.cc \
>     ../../io/socket.cc ../../io/splice.cc ../../io/splice_pair.cc \
>     ../../io/unix_client.cc ../../io/unix_server.cc ../../net/tcp_client.cc \
>     ../../net/tcp_server.cc ../../net/udp_client.cc ../../net/udp_server.cc \
>     ../../xcodec/xcodec_decoder.cc ../../xcodec/xcodec_encoder.cc \
>     ../../xcodec/xcodec_decoder_pipe.cc ../../xcodec/xcodec_encoder_pipe.cc
>
> CXXFLAGS=-I../.. -include common/common.h -DXCODEC_PIPES -DUSE_POLL_POLL
>
> # The recipe line must start with a tab character, not spaces.
> wanproxy:
> 	c++ -o wanproxy ${CXXFLAGS} ${SRCS} -lrt
>
> %%%
>
>
>
> Just put that in 'GNUmakefile' in programs/wanproxy and run 'make' there.
>  Obviously you could also just invoke c++ by hand, given that level of
> sophistication.
>
>
>
> On Tue, Dec 8, 2009 at 08:44, Tan, William <wtan at eci.com> wrote:
>
> I haven’t had any luck building wanproxy (0.6.0) on Linux (Fedora 8-11 to
> be specific).  I am experiencing the same problem described by previous
> posters with pmake not liking the .include.
>
> Has anyone successfully built on a recent 2.6 Linux kernel, and can they
> provide any guidance?
>
>
>
> Thanks.
>
> NOTICE: The information contained in this transmission is privileged,
> confidential, and intended only for the use of the individual or entity
> named above. If you are not the intended recipient, you are hereby notified
> that any disclosure, copying, distribution, or the taking of any action in
> reliance on the contents of this transmission is strictly prohibited. If you
> have received this transmission in error, please notify Eze Castle
> Integration, Inc. by e-mail and destroy the original message and all copies.
> Thank you.
>
>
>
> _______________________________________________
> wanproxy mailing list
> wanproxy at lists.wanproxy.org
> http://lists.wanproxy.org/listinfo.cgi/wanproxy-wanproxy.org
>