juli at clockworksquid.com
Tue Oct 4 19:14:18 PDT 2011
It's a trade-off, and ideally I'd put in the effort to make this
runtime tunable without sacrificing too much performance (this
actually isn't very hard; I'll spare you the details, but it's
something I could conceivably do in a weekend just by templating all
You might find reading this thread informative (start at the bottom):
The short of it is that with a larger segment size we use fewer
segments, can keep smaller tables in memory, and get much better
performance (throughput), etc. 2KB was chosen empirically, with a bit
of math to look at what the file-size impact would be.
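
As a rough illustration of the "fewer segments, smaller tables" side of
the trade-off, the following sketch counts how many segments are needed
to cover a given amount of unique data at different segment sizes. The
1 GiB figure and the function name are assumptions for illustration
only; this is plain arithmetic, not anything taken from WANProxy.

```python
def segment_count(total_bytes: int, segment_size: int) -> int:
    """Number of fixed-size segments needed to cover total_bytes."""
    # Ceiling division: a partial trailing segment still needs an entry.
    return -(-total_bytes // segment_size)


total = 1 << 30  # 1 GiB of unique data (hypothetical figure)

for size in (128, 512, 1024, 2048):
    print(f"{size:5d}-byte segments: {segment_count(total, size)} table entries")
```

Going from 128-byte to 2KB segments cuts the number of entries by a
factor of 16 for the same amount of data.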
Using a smaller chunk, of course, increases the possibility of finding
chunks that have been seen before. If a file is identical, though,
using 2KB segments should result in a smaller file overall. It's only
in the case of coincidence or a modified file that the smaller segment
size makes a big difference. And, really, that should only impact a
small number of segments around the change, which is not very much in
terms of bytes.
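
To make the "small number of segments around the change" point
concrete, here is a simplified sketch. It assumes fixed-offset segments
and a single byte changed in place, and it uses SHA-1 as a stand-in
segment identifier; none of that is meant to describe how XCodec
actually matches data. The function name is hypothetical.

```python
import hashlib
import random


def bytes_resent(old: bytes, new: bytes, segment_size: int) -> int:
    """Bytes of `new` that fall in segments not already seen in `old`."""
    seen = {
        hashlib.sha1(old[i:i + segment_size]).digest()
        for i in range(0, len(old), segment_size)
    }
    resent = 0
    for i in range(0, len(new), segment_size):
        segment = new[i:i + segment_size]
        if hashlib.sha1(segment).digest() not in seen:
            resent += len(segment)  # unseen segment must be sent in full
    return resent


# 64 KiB of reproducible pseudo-random data standing in for a file.
rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

# Flip one byte in place to simulate a small modification.
changed = bytearray(original)
changed[30000] ^= 0xFF
modified = bytes(changed)

for size in (128, 2048):
    print(f"{size:5d}-byte segments: {bytes_resent(original, modified, size)} bytes resent")
```

Under these assumptions a one-byte change costs one segment either way:
128 bytes with the small size, 2048 bytes with the large one. The
difference is real but small in absolute terms.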
If you do change the XCodec segment size, you need to change the
buffer segment size in common/buffer.h as well and rebuild everything.
Otherwise, you're wasting a lot of RAM.
And on top of all that, whatever the segment size, we also now support
zlib compression. Using 2KB segments and zlib compression should give
similar throughput to using just 128-byte segments and no zlib, and
yield a much better reduction in data size.
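
For a feel of what a zlib pass contributes on its own, independent of
segment size, here is a minimal sketch using Python's standard zlib
module. The payload is made up and highly repetitive, so the saving it
prints is not representative of real traffic; the saving() helper is a
hypothetical name that just expresses the result in the same terms as
the percentages quoted below.

```python
import zlib


def saving(original_len: int, encoded_len: int) -> float:
    """Fraction of bytes saved, e.g. 0.67 for a 67% saving."""
    return 1.0 - encoded_len / original_len


# Made-up, repetitive payload standing in for captured traffic.
payload = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n" * 200

compressed = zlib.compress(payload, 6)  # level 6 is zlib's usual default
restored = zlib.decompress(compressed)  # round-trips to the original bytes

print(f"{len(payload)} -> {len(compressed)} bytes, "
      f"saving {saving(len(payload), len(compressed)):.0%}")
```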
I'm happy to discuss it and consider changes; let me know what you think!
On Tue, Oct 4, 2011 at 18:40, Diego Woitasen <diego at woitasen.com.ar> wrote:
> I was testing the Wanproxy compression and have questions about the
> segment size. The test was done with traffic captured in one of the
> branches of one of my clients. I used chaosreader to reassemble the
> With the default segment size (2048) the saving is 67%. I tried with
> segment sizes of 1024 and 512. The saving is 71% and 78% respectively.
> I haven't tested this on the wire yet, I'll do it tomorrow. In the
> meantime, my question is: what's the reason for a segment size of
> 2048? Is it based on experience?
> Diego Woitasen