where does wanproxy cache files
aquarypbx at gmail.com
Thu Mar 14 00:41:37 PDT 2013
We came up with another question:
We tested sending 2GB files to the "client". The second time, the
transmission from "client" to "server" was faster, but from "server" to
"intranet" it was still the same speed as the first time. So wanproxy is only
optimizing traffic between "client" and "server", but not between "server" and
"intranet", correct? If so, what is the purpose of the "intranet" in this
On Fri, Mar 8, 2013 at 11:05 PM, Juli Mallett <juli at clockworksquid.com> wrote:
> On Fri, Mar 8, 2013 at 10:54 PM, Boxiang Pan <aquarypbx at gmail.com> wrote:
> > Hi, Juli,
> > We have successfully used wanproxy to transfer files from client to the
> > intranet through the server. And we managed to time the transmission
> > The second time, it indeed transferred faster. We've also managed to run
> > wanproxy from our program. Thanks a lot for all the help!
> > We have a few more questions about how wanproxy is implemented at top
> > 1) Since it takes less time to transfer the second time, we assume that
> > file is cached somewhere, so where is the file cached? Is it cached on
> > "server", which is between the "client" and the "intranet"?
> As you may have noticed, WANProxy works independently of the protocol
> being used. It has no notion of files, only data.
> Both client and server remember the data that has been sent between
> them in the past. They remember it in blocks of 2KB, and each block
> has a unique 64-bit name derived from a hash of the block's data. When data is
> being transferred between them, it is split up into blocks so that
> where possible they only transfer the 64-bit name for parts of the
> data being transferred, rather than the whole 2KB block. Where data
> has been inserted, removed or changed, the blocks around it will still
> be replaced with their names, but the new or changed piece will be
> transferred for the first time.
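
A minimal sketch of the scheme described above, not WANProxy's actual code or wire format. The function names (`name_of`, `learn`, `encode`, `decode`, `wire_bytes`), the use of BLAKE2b for the 64-bit name, and the ("ref", name) / ("lit", bytes) operations are all assumptions made for illustration. The encoder tries a 2KB window at every byte offset, so blocks that follow an insertion are still found, roughly in the spirit of rsync.

```python
import hashlib
import random

BLOCK = 2048  # block size taken from the description above


def name_of(block: bytes) -> bytes:
    # Hypothetical naming scheme: 8 bytes (64 bits) of a BLAKE2b digest.
    return hashlib.blake2b(block, digest_size=8).digest()


def learn(cache: dict, data: bytes) -> None:
    # Both ends remember the stream they have just seen, in 2KB blocks.
    for i in range(0, len(data) - BLOCK + 1, BLOCK):
        block = data[i:i + BLOCK]
        cache.setdefault(name_of(block), block)


def encode(cache: dict, data: bytes) -> list:
    # Replace every known 2KB window with its name; send the rest literally.
    ops, lit, i = [], bytearray(), 0
    while i < len(data):
        window = data[i:i + BLOCK]
        known = len(window) == BLOCK and cache.get(name_of(window)) == window
        if known:
            if lit:
                ops.append(("lit", bytes(lit)))
                lit.clear()
            ops.append(("ref", name_of(window)))
            i += BLOCK
        else:
            lit.append(data[i])
            i += 1
    if lit:
        ops.append(("lit", bytes(lit)))
    return ops


def decode(cache: dict, ops: list) -> bytes:
    # Expand names back into the remembered blocks.
    out = bytearray()
    for kind, payload in ops:
        out += cache[payload] if kind == "ref" else payload
    return bytes(out)


def wire_bytes(ops: list) -> int:
    # Payload bytes only; any framing overhead is ignored in this sketch.
    return sum(8 if kind == "ref" else len(payload) for kind, payload in ops)


rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(8 * BLOCK + 100))
new = old[:5000] + b"inserted" + old[5000:]  # small insertion mid-stream

sender, receiver = {}, {}
first = encode(sender, old)            # nothing known yet: all literal
assert decode(receiver, first) == old
learn(sender, old)
learn(receiver, old)

second = encode(sender, new)           # mostly names this time
assert decode(receiver, second) == new
print(wire_bytes(first), wire_bytes(second))
```

Only the block that contains the insertion, plus the short tail of the stream, goes over the link as literal data on the second transfer; everything else is sent as 8-byte names.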
> > 2) In the scenario where a client is periodically backing up a large
> > say 2GB, to the "intranet", but each backup, the file only differs a
> > portion (100MB) from the previous version, does wanproxy treat each
> > file as a completely different file and cache the entire file again, or
> > it smartly only updated the new 100MB in the previously cached file? If
> > how does wanproxy tells the difference?
> Why don't you try it? :) The above probably makes the answer
> obvious, but just to be clear:
> The data that are unchanged, that have been sent previously, will be
> replaced with their names, and the data which are new or which are
> changed will be transferred over the link and remembered for later
> use. So if a different 100MB changes every time, you should still
> only need to send about 100MB of data plus the names that correspond
> to the rest of it.
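
As a rough back-of-the-envelope check of that estimate, here is a small calculation using the 2KB block size and 64-bit (8-byte) names mentioned earlier. The function name `estimated_wire_bytes` is hypothetical, and the sketch assumes binary units, a changed region that lines up with block boundaries, and no protocol framing overhead, none of which is confirmed for WANProxy itself.

```python
BLOCK = 2048   # bytes per remembered block (from the description above)
NAME = 8       # bytes per 64-bit block name
MiB = 1024 * 1024


def estimated_wire_bytes(total: int, changed: int) -> int:
    # Changed data is sent literally; each unchanged block costs one name.
    unchanged_blocks = (total - changed) // BLOCK
    return changed + unchanged_blocks * NAME


estimate = estimated_wire_bytes(2048 * MiB, 100 * MiB)
print(round(estimate / MiB, 1), "MiB on the wire instead of 2048 MiB")
```

Under these assumptions the names for the unchanged part add only a few MiB, since each 8-byte name stands in for a 2048-byte block (a 256:1 ratio).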
> > 3) If there have already been a lot of files cached on the "server", the
> > next time the "client" is sending a file to the "server", how does the
> > server know if that file has previously been cached?
> See above. WANProxy does deduplication of TCP streams and is not file
> oriented in any way. As far as it knows, it could be deduplicating an
> IRC session, E-Mail, web traffic, file replication, etc.
> > 4) Is there any reference document/ readme about the implementation of
> > wanproxy that we may refer to?
> I have sent other information to the mailing lists in the past. You
> can also read about how 'rsync' works. rsync is file-oriented, but
> the algorithm is basically the same.
> There is a little bit of information here, but it is out of date about
> protocol details:
> There have also been some posts to the mailing list, but there's not
> any one message that I would point you to. I'm happy to answer any
> more questions.
Department of Electrical and Computer Engineering
University of California, San Diego