XCodec segment size: request for comment.

Mon Jan 3 10:45:18 PST 2011

Hi Juli,

That is basically it - and I understand optimizing for specific
traffic types would require customizations.

I will look into getting some data points with the changes you have suggested.

thanks,

-Uttam

On Mon, Jan 3, 2011 at 2:51 AM, Mallett, Juli <juli at clockworksquid.com> wrote:
> Hi Uttam,
>
> If I understand what you're saying, your concern is that there are a
> lot of protocols which might put ephemeral data (say, checkums, HMACs,
> unique identifiers, cookies, etc.) less than every 2048 bytes of data,
> is that right?  Protocol-specific modules would fix that, but
> obviously that is a major investment in time, even just to do simple
> framing to avoid compressing the ephemeral parts.  That's a good point
> and one I hadn't thought of recently but which went in to the original
> chunk size choice.
>
> Of course, the ideal would be to support a variety of chunk sizes, or
> at least a couple, say 2048 and 128, but that's non-trivial in terms
> of development, too.
>
> If you want to do a couple of tests with what's in Subversion and then
> with the chunk size changed, make these modifications:
>
> o) In xcodec/xcodec.h define XCODEC_SEGMENT_SIZE as 2048 rather than 128.
> o) In xcodec/xcodec_hash.h in XCodecHash::mix, replace the bitwise ors
> with adds (| into +).
>
> I hope the CIFS block size is big enough that 2KB wouldn't be
> disruptive, and I'd be happy to have more empirical data.  We don't
> need to be able to deduplicate *every* message, just the ones that are
> big enough to matter.
>
> Thanks,
> Juli.
>
> On Mon, Jan 3, 2011 at 00:41, Uttam Singh <us at iptvlabs.com> wrote:
>> Hi Juli,
>>
>> I have not seen any response from others on the list, maybe they are
>> sending direct replies to you and not cc: wanproxy list. Would
>> appreciate inputs and application scenarios that others are tackling.
>>
>> I have experimented with wanproxy for SMB/CIFS traffic - common with
>> Windows based client/server networking. Its still in a test network so
>> I don't have real world results.
>>
>> I have a feeling that at least in my scenario, 2KB hash size may not
>> be efficient. I think bsd mbuf size selection (2KB) was mostly done to
>> efficiently move standard ethernet frames. For WAN traffic where L5
>> protocol headers (e.g. SMB) encapsulate user data in each packet -
>> smaller hashing size would be better.
>>
>> Given your thoughts on overhead ratios - one could argue for 256 or,
>> even 512 Byte. It would definitely help system performance and memory
>> usage if there was a close match between Xcodec buffers and system
>> buffers.
>>
>> I would be willing to capture large sample data (gathered over couple
>> of days on live network) with tcpdump if you have a way of running it
>> through a codec analyzer for any useful information.
>>
>> I will take a look at 0.7.0 or, 0.7.1 in a couple of days.
>>
>> thanks,
>>
>> -Uttam
>>
>>
>> On Sat, Jan 1, 2011 at 10:39 PM, Mallett, Juli <juli at clockworksquid.com> wrote:
>>> Hey folks,
>>>
>>> I'm starting to consider revisiting a very, very old decision within
>>> WANProxy and would love to hear some feedback from users.
>>>
>>> I recently moved to using 2KB buffers internally for data, which means
>>> that 2KB is the smallest amount of data that the XCodec can store
>>> internally, and that reads, etc., happen with 2KB buffers (with some
>>> caveats.)  This size was chosen empirically, but it's also the
>>> traditional size for BSD mbufs, so it's fairly time-honored.  Using
>>> 2KB buffers, simple I/O test programs using the WANProxy event system
>>> were much faster than the old size, 128 bytes.
>>>
>>> The problem is that the XCodec still hashes data in 128 byte blocks.
>>> So although the XCodec has to keep data around in 2KB chunks, it's
>>> only using 128 bytes of each chunk, leading to rather a lot of
>>> overhead.  A 1MB file takes up 16MB in RAM, which obviously is too
>>> much overhead to bear.  I meant to revisit the matter before releasing
>>> 0.7.0 but, frankly, forgot.
>>>
>>> So 2KB is very good for I/O, but those I/O benchmarks don't matter,
>>> WANProxy is what matters.  So perhaps I should move back to using 128
>>> byte buffers, so that XCodec's chunk size matches the buffer size, or
>>> I should put some time into a couple of more complicated solutions
>>> that would allow multiple XCodec chunks in a single byte buffer, and
>>> also speed up some common operations in the encoder.
>>>
>>> But that leads to the question: why use 128 byte chunks in the XCodec?
>>>  The original rationale was that that meant that any deltas would be
>>> very small -- if you send a file once and then change a byte or insert
>>> or remove a byte in the middle, the most we'll have to send in the
>>> clear is that new 128 byte chunk until the XCodec picks up the
>>> original trail.  128 bytes isn't much.  But it also gives a high
>>> overhead ratio in the protocol -- we need two bytes of overhead just
>>> to 'extract' a chunk from the stream with XCodec, and 2 bytes of
>>> overhead for each 128 bytes isn't nothing.
>>>
>>> So why not just use 2KB chunks in XCodec and 2KB buffers?  Well,
>>> because then if there's a change or deletion or insertion we're
>>> inefficient over a much larger area.  On the other hand, it speeds up
>>> encoding considerably: a 128MB random file that I have takes 25
>>> seconds to encode with 128 byte chunks but only 1 second with 128 byte
>>> chunks.  It reduces overhead, so we have 2 bytes of overhead for an
>>> 'extract' of 2048 bytes; referencing a 128 byte segment from the
>>> distant past takes 10 bytes now, but we could reference 2KB with 10
>>> bytes instead.
>>>
>>> Of course, that overhead for changes really is unacceptable.  I mean,
>>> with smaller block sizes like 128 bytes, we even manage to compress
>>> some files on the first transmission, which is less likely for higher
>>> block sizes.  But is the XCodec really the best tool for that job?  We
>>> have zlib now, but zlib is slow.  But, on the other hand, if the
>>> XCodec is faster, we can spare some time for zlib, and let zlib clean
>>> up the mess if we can't compress some data.  Which is even more
>>> convincing given that we gain so much protocol efficiency from using
>>> 10 bytes (or 2 bytes) to reference so much more data.
>>>
>>> Even running zlib at very low compressor levels, which cost very
>>> little CPU time, could probably make up for the differences.  Keep in
>>> mind that the XCodec was originally developed to help me store
>>> backups, and it didn't support zlib and I didn't have a good way to do
>>> zlib after XCodec externally.  But more and more it seems
>>> inappropriate to use such a small chunk size for XCodec.
>>>
>>> With 128 byte chunks and zlib today, one of my test setups with random
>>> data files over HTTP manages ~10MB/s for previously-sent data, but a
>>> quick test with 2KB chunks gives 33MB/s, and could probably go further
>>> if I were using a test file large enough for the TCP window to grow
>>> more.  And that's also with zlib at a ridiculously-high level: 9.
>>>
>>> If there's nobody out there with a good argument against the change, I
>>> will probably make it within the next day or two and release 0.7.1,
>>> along with a fix that improves performance in XCodec+zlib setups.
>>>
>>> The biggest drawback is that much of the time I've invested into
>>> trying to reduce collisions in the hash algorithm is lost and I'll
>>> have to rethink some of the design decisions there.  But the existing
>>> hash algorithm doesn't do too badly and the implications of a larger
>>> block size for the hashes are a little fuzzy to me -- there's more
>>> bits going into a hash of the same size, but the probability of
>>> collisions in practice may be better.
>>>
>>> Thanks,
>>> Juli.
>>> _______________________________________________
>>> wanproxy mailing list
>>> wanproxy at lists.wanproxy.org
>>> http://lists.wanproxy.org/listinfo.cgi/wanproxy-wanproxy.org
>>>
>>
>