Exchange of previously-sent data is in Subversion.

Juli Mallett juli at clockworksquid.com
Sun Oct 18 03:30:53 PDT 2009


Hey folks,

In preparation for 0.6.0 the wire protocol has changed again, but
there's finally a hackish implementation of the exchange of
previously-sent data.  It's likely still buggy but I have been testing
it with some pretty demanding web traffic and feel like I have found
and fixed most of the likely problem areas.  Note that the new wire
protocol has lower overhead overall.

The implication is this: if, for whatever reason, one side doesn't
know about data that the other side is referring to, they will go back
and forth until they're on the same page.  There are some cases where
some sort of arbitration is needed (I use hash A to refer to data D
and you use hash A to refer to data E), and that isn't implemented at
all at the moment.  What are some common cases where one side is
missing data the other side refers to?

Well, if you have two WANProxy systems and you restart one, previously
you would have to restart the other one, too.  Now you don't.

If you have two WANProxy systems and some connections between them get
reset, you might miss the definition of a hash on one side, and the
other side has no way of knowing.  Now that's not a problem.

Any time this exchange of previously-sent data comes into play,
things will run slower, but it should be a transient condition.
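
To make the back-and-forth concrete, here is a minimal sketch in
Python.  It is not the WANProxy code and does not reflect the actual
wire protocol: the message names (DEFINE, REF, LEARN), the Peer class
and the use of SHA-1 are all assumptions made up for illustration.  It
only shows the idea that a reference to an unknown hash costs one extra
round trip, after which both sides agree again.

```python
import hashlib


def chunk_hash(data: bytes) -> str:
    # Hypothetical hash; the real protocol's hash function is not described here.
    return hashlib.sha1(data).hexdigest()[:16]


class Peer:
    """One side of a link, remembering data it has sent or received."""

    def __init__(self):
        self.cache = {}  # hash -> data this side knows about

    def encode(self, data: bytes):
        h = chunk_hash(data)
        if h in self.cache:
            # Already sent once, so refer to it by hash only.
            return ("REF", h)
        self.cache[h] = data
        return ("DEFINE", h, data)

    def answer_ask(self, h: str):
        # The other side asked what a hash means; send the data again.
        return ("LEARN", h, self.cache[h])

    def decode(self, msg, sender: "Peer"):
        """Return (data, extra_round_trips)."""
        if msg[0] == "DEFINE":
            _, h, data = msg
            self.cache[h] = data
            return data, 0
        _, h = msg
        if h in self.cache:
            return self.cache[h], 0
        # Unknown hash: ask the sender, learn the data, then carry on.
        _, learned_hash, data = sender.answer_ask(h)
        self.cache[learned_hash] = data
        return data, 1


a, b = Peer(), Peer()
b.decode(a.encode(b"hello"), a)                 # definition is sent once
b = Peer()                                      # simulate restarting one side
data, extra = b.decode(a.encode(b"hello"), a)   # reference to a forgotten hash
```

After the simulated restart, the first reference costs one extra round
trip; later references to the same data cost none, which is why the
slowdown should be transient.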

A technical note, wrt the arbitration issue: it should be easy enough
to just keep a list of hashes to blackhole in the encoder so we don't
use hash A when encoding if the other side disagrees with us about the
data it refers to, and then in the decoder we can use that same list
of hashes to decode data from the remote side based on what it seems
to intend.
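
A rough sketch of that blackhole idea, again in Python and again
hypothetical: none of these names (Codec, dispute, LITERAL, REF) come
from WANProxy, the hash is deliberately weak so a collision is easy to
show, and how a disagreement gets detected in the first place is left
out entirely.

```python
def weak_hash(data: bytes) -> int:
    # Deliberately weak so that collisions are easy to demonstrate.
    return sum(data) % 256


class Codec:
    def __init__(self):
        self.known = {}      # hash -> data, as this side understands it
        self.blackhole = {}  # disputed hash -> data the remote side means by it

    def dispute(self, h: int, remote_data: bytes):
        # Record that the remote side uses hash h for different data.
        self.blackhole[h] = remote_data

    def encode(self, data: bytes):
        h = weak_hash(data)
        if h in self.blackhole:
            # Never use a disputed hash when encoding; send the data itself.
            return ("LITERAL", data)
        # Sending the definition to the remote side is omitted in this sketch.
        self.known[h] = data
        return ("REF", h)

    def decode(self, msg):
        kind, payload = msg
        if kind == "LITERAL":
            return payload
        if payload in self.blackhole:
            # The remote side used a disputed hash: decode what it intends.
            return self.blackhole[payload]
        return self.known[payload]


codec = Codec()
h = weak_hash(b"ab")       # b"ab" and b"ba" collide under weak_hash
codec.encode(b"ab")        # encoded by reference while undisputed
codec.dispute(h, b"ba")    # the remote side means b"ba" by the same hash
codec.encode(b"ab")        # now sent literally
codec.decode(("REF", h))   # decoded as the remote side intends
```

The same table serves both directions: the encoder only needs to know
which hashes to avoid, and the decoder additionally needs what the
remote side means by them.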

Anyway, if you've been thinking about putting WANProxy into
production, you probably want to try what's in Subversion now, as it's
a lot closer to production-ready.  Or you can wait for 0.6.0.

Thanks,
Juli.


