WhyNotHugo

Reputation: 9926

Optimizing file synchronization over HTTP

I'm attempting to synchronize a set of files over HTTP.
For the moment, I'm using HTTP PUT and re-uploading any file that has changed. However, this is very inefficient for large files where the delta is very small.
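
Roughly, my current approach looks like this minimal sketch in Python (the URL and filename are placeholders):

    import requests  # third-party HTTP client; the stdlib would work too

    # Re-upload the entire file whenever it changes, however small the edit.
    with open("big-file.bin", "rb") as f:
        requests.put("https://example.com/files/big-file.bin", data=f)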

I'd like to do something closer to what rsync does and transmit only the deltas, but I'm wondering what the best approach would be.

I know I could use an rsync library on both ends and wrap their communication over HTTP, but that sounds more like an antipattern: tunneling a standalone protocol over HTTP. I'd like to do something that's more in line with how HTTP works, and not wrap binary data (except my files, duh) in an HTTP request/response.

I've also failed to find any relevant/useful functionality already implemented in WebDAV.

I have total control over the client and server implementation, since this is a desktop-ish application (meaning "I don't need to worry about browser compatibility").

Upvotes: 1

Views: 932

Answers (1)

Szocske

Reputation: 7661

HTTP PATCH, recommended in a comment, requires the client to keep track of its local changes, which may not be feasible given the size of the file.

Alternatively, you could treat "chunks" of the huge file as resources in their own right: depending on the nature of the changes and the content of the file, a chunk could be a fixed range of bytes, a chapter, whatever.

The client could query the hash of all chunks, calculate the same for the local version, and PUT only the changed ones.
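
A rough sketch in Python of how that could look, assuming a hypothetical server that publishes its chunk hashes as a JSON list at /chunks and accepts PUTs of individual chunks (the endpoint names and chunk size here are made up for illustration):

    import hashlib

    import requests  # third-party HTTP client

    CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB; tune to how localized your changes are
    BASE = "https://example.com/files/big-file.bin"  # hypothetical resource

    def local_chunk_hashes(path):
        """SHA-256 hex digest of each fixed-size chunk of the local file."""
        hashes = []
        with open(path, "rb") as f:
            while chunk := f.read(CHUNK_SIZE):
                hashes.append(hashlib.sha256(chunk).hexdigest())
        return hashes

    # Ask the server for its chunk hashes (assumed: a JSON list of hex digests).
    remote = requests.get(BASE + "/chunks").json()
    local = local_chunk_hashes("big-file.bin")

    # PUT only the chunks that differ or that the server doesn't have yet.
    with open("big-file.bin", "rb") as f:
        for i, digest in enumerate(local):
            if i >= len(remote) or remote[i] != digest:
                f.seek(i * CHUNK_SIZE)
                requests.put(f"{BASE}/chunks/{i}", data=f.read(CHUNK_SIZE))

Note that fixed-size chunks keep both ends simple, but an insertion near the start of the file shifts every later chunk and forces a near-full re-upload; rsync's rolling checksum exists precisely to handle that case.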

Upvotes: 2
