Eamon

Reputation: 53

Block Level Copying and Rsync

I am trying to use grsync (a GUI for rsync) on Windows to run backups. The directory I am backing up contains many large files that are updated periodically. I would like to sync just the changes to those files rather than the entire file on each backup. I was under the impression that rsync is a block-level file copier and would only copy the bytes that had changed since the previous sync. Perhaps this is not the case, or I have misunderstood what block-level file copying is!

To test this, I used grsync to synchronize a 5 GB zip file between two directories. Then I added a very small text file to the zip file and ran grsync again. However, it proceeded to copy the entire zip file again. Is there a utility that would copy only the changes to this zip file and not the entire file? Or is there an option within grsync that could be used to this effect?

Upvotes: 5

Views: 8921

Answers (4)

Jason Stewart

Reputation: 383

@user5063219 rightly states that a minor change to a zip file can shuffle the blocks so that they no longer match up, turning rsync into a slower version of cp. If you can use a recent version of gzip or zstd instead of zip, both have an --rsyncable flag, which structures the compressed output in a more rsync-friendly way.

If the files you are trying to rsync are encrypted, small changes to the plaintext would also drastically change the file in an rsync-unfriendly way.

Upvotes: 0

ben utzer

Reputation: 1

I was just looking for this myself. I think you have to use

rsync -av --inplace

for this to work.

Upvotes: -2

Chris Davies

Reputation: 644

The reason the entire file was copied is simply that the algorithm that handles block-level changes is disabled when copying between two directories on a local filesystem.

This would have worked, because the file is being copied (or updated) to a remote system:

rsync -av big_file.zip remote_host:

This will not use the "delta" algorithm and the entire file will be copied:

rsync -av big_file.zip D:\target\folder\

Some notes

  1. Even if the target is a network share, rsync will treat it as part of your local filesystem and will disable the "delta" (block changes) algorithm.
  2. Adding data to the beginning or middle of a data file will not upset the algorithm that handles the block-level changes.
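To illustrate the second note, here is a minimal Python sketch of rsync-style block matching. It is not rsync's actual code: the 64-byte block size, the function names, and the use of SHA-256 as the strong hash are all assumptions made for the demo. The idea is that the checksum window slides over the new file one byte at a time, so blocks of the old file are still found after data has been inserted in front of them.

```python
import hashlib
import os

BLOCK = 64  # tiny block size for the demo; real rsync uses much larger blocks


def weak_sum(window):
    """Adler-style weak checksum of a whole window, returned as (a, b)."""
    a = b = 0
    n = len(window)
    for i, byte in enumerate(window):
        a = (a + byte) & 0xFFFF
        b = (b + (n - i) * byte) & 0xFFFF
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window one byte to the right in constant time."""
    a = (a - out_byte + in_byte) & 0xFFFF
    b = (b - n * out_byte + a) & 0xFFFF
    return a, b


def delta(old, new):
    """Describe `new` as ('copy', old_block_index) and ('data', bytes) steps."""
    # Index every full block of the old file by its weak checksum.
    table = {}
    for idx in range(len(old) // BLOCK):
        block = old[idx * BLOCK:(idx + 1) * BLOCK]
        entry = (hashlib.sha256(block).digest(), idx)
        table.setdefault(weak_sum(block), []).append(entry)

    ops, literal, pos = [], bytearray(), 0
    sums = weak_sum(new[:BLOCK]) if len(new) >= BLOCK else None
    while pos + BLOCK <= len(new):
        window = new[pos:pos + BLOCK]
        match = None
        # Only compute the expensive strong hash when the weak checksum hits.
        for strong, idx in table.get(sums, ()):
            if hashlib.sha256(window).digest() == strong:
                match = idx
                break
        if match is not None:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))
            pos += BLOCK
            if pos + BLOCK <= len(new):
                sums = weak_sum(new[pos:pos + BLOCK])
        else:
            literal.append(new[pos])
            if pos + BLOCK < len(new):
                sums = roll(sums[0], sums[1], new[pos], new[pos + BLOCK], BLOCK)
            pos += 1
    literal.extend(new[pos:])
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def apply_delta(old, ops):
    """Rebuild the new file from the old file plus the delta steps."""
    out = bytearray()
    for kind, payload in ops:
        if kind == "copy":
            out += old[payload * BLOCK:(payload + 1) * BLOCK]
        else:
            out += payload
    return bytes(out)


# Demo: insert a few bytes at the very beginning of a 100-block file.
old = os.urandom(BLOCK * 100)
new = b"a few inserted bytes" + old
ops = delta(old, new)
reused = sum(1 for kind, _ in ops if kind == "copy")
sent = sum(len(payload) for kind, payload in ops if kind == "data")
print(reused, "blocks reused,", sent, "literal bytes sent")
```

Only the inserted bytes should need to travel as literal data; everything else is described as references to blocks the receiver already has. This holds for uncompressed data. A compressed or encrypted file can change throughout after a small edit, which is a separate problem.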

Rationale

The delta algorithm is disabled when copying between two local targets because it needs to read both the source and the destination file completely in order to determine which blocks need changing. The rationale is that the time taken to read the target file is much the same as just writing to it, and so there's no point reading it first.

Workaround

If you know for certain that reading from your target filesystem is significantly faster than writing to it, you can force the block-level algorithm to run by including the --no-whole-file flag.
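If you want to check whether the block-level algorithm is actually being used, one option is to run rsync with --stats and compare the "Literal data" and "Matched data" figures it reports. The Python sketch below wraps that up. The function names and the example paths are hypothetical, rsync is assumed to be on the PATH, and the parsing assumes the stats lines look like "Matched data: N bytes", so treat it as a starting point rather than a finished tool.

```python
import re
import subprocess


def rsync_command(src, dst, force_delta=True):
    """Build the rsync argument list; --no-whole-file forces the delta algorithm."""
    cmd = ["rsync", "-av", "--stats"]
    if force_delta:
        cmd.append("--no-whole-file")
    return cmd + [src, dst]


def parse_stats(output):
    """Extract the 'Literal data' and 'Matched data' byte counts from --stats output."""
    counts = {}
    pattern = r"^(Literal|Matched) data: ([\d,.]+) bytes"
    for label, digits in re.findall(pattern, output, re.M):
        # Drop thousands separators before converting to an integer.
        counts[label.lower()] = int(re.sub(r"\D", "", digits))
    return counts


def sync(src, dst, force_delta=True):
    """Run rsync and return how many bytes were matched versus sent literally."""
    result = subprocess.run(
        rsync_command(src, dst, force_delta),
        capture_output=True, text=True, check=True,
    )
    return parse_stats(result.stdout)
```

For example, sync("big_file.zip", "/mnt/backup/") (a hypothetical local target) returns a dictionary with "literal" and "matched" byte counts. A large "matched" figure means existing blocks in the target were reused. If "matched" is zero and "literal" is roughly the file size, the whole file was copied.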

Upvotes: 10

user5063219

Reputation: 29

If you add a file to a zip archive, the entire zip file can change if the file was added as the first file in the archive, because the rest of the archive will shift. So yours is not a valid test.

Upvotes: 2
