invinciblecache

Reputation: 113

linux - after rsync, du shows size difference when diff does not

I copied a large folder from NTFS to ext4 using 'rsync' and validated it with 'diff'. Just for the sake of curiosity, I also used the 'du' command to check whether the folders had the same size. While 'diff' didn't show any difference, 'du' showed that the folders had different sizes. I did not encounter any errors while executing the following commands.

rsync --archive --recursive "$src" "$dest" 2>rsync_error.txt

sync

diff --brief --recursive --new-file "$src" "$dest" 1>diff-log.txt 2>diff-error.txt

Then I used 'du' for each folder:

du -sb "$src"
du -sb "$dest"
Output:
137197597476
137203512004

1. Why would this happen when there is no difference?

2. Should I be worried about my data or my system?

EDIT: I also tried du -s --apparent-size and there is still a difference.

Upvotes: 3

Views: 2507

Answers (3)

F. Hauri  - Give Up GitHub

Reputation: 70977

Sparse files

Under Linux, you can create so-called sparse files: files in which blocks consisting entirely of NUL bytes are not actually stored on disk!

Try this:

$ dd if=/dev/zero count=2048 of=normalfile
2048+0 records in
2048+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0103269 s, 102 MB/s

and

$ dd if=/dev/zero count=0 seek=2048 of=sparsefile
0+0 records in
0+0 records out
0 bytes copied, 0.000182708 s, 0.0 kB/s

then

$ ls -l sparsefile normalfile
-rw-r--r-- 1 user  user  1048576 Feb  3 17:53 normalfile
-rw-r--r-- 1 user  user  1048576 Feb  3 17:53 sparsefile

$ du -b sparsefile normalfile
1048576     sparsefile
1048576     normalfile

but

$ du -k sparsefile normalfile
0   sparsefile
1024        normalfile

$ du -h sparsefile normalfile
0   sparsefile
1.0M        normalfile

As long as blocks in sparsefile are not used, they are not allocated!

$ du -k --apparent-size sparsefile normalfile
1024        sparsefile
1024        normalfile

Then

$ diff sparsefile normalfile
$ echo $?
0

The contents of the two files are identical!

Further

$ /sbin/mkfs.ext4 sparsefile 
mke2fs 1.44.5 (15-Dec-2018)
Filesystem too small for a journal
...
Writing superblocks and filesystem accounting information: done

$ ls -l sparsefile normalfile 
-rw-r--r-- 1 user  user  1048576 Feb  3 17:53 normalfile
-rw-r--r-- 1 user  user  1048576 Feb  3 17:59 sparsefile

$ du -k sparsefile 
32  sparsefile

$ diff sparsefile normalfile
Binary files sparsefile and normalfile differ
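
The same comparison can be scripted. This is a minimal sketch, assuming GNU coreutils (truncate, stat -c, head -c); the scratch directory and the variable names are made up for illustration. It prints the apparent size and the allocated block count for a sparse and a dense file, then checks that their contents are identical.

```shell
#!/bin/sh
# Sketch: apparent size vs. allocated blocks for a sparse and a dense file.
# Assumes GNU coreutils; all names below are illustrative only.
workdir=$(mktemp -d)

truncate -s 1M "$workdir/sparse"               # 1 MiB hole, no data written
head -c 1048576 /dev/zero > "$workdir/dense"   # 1 MiB of real zero bytes

sparse_apparent=$(stat -c %s "$workdir/sparse")   # size in bytes, as ls -l shows it
dense_apparent=$(stat -c %s "$workdir/dense")
sparse_blocks=$(stat -c %b "$workdir/sparse")     # allocated blocks (unit: stat -c %B)
dense_blocks=$(stat -c %b "$workdir/dense")

echo "apparent: sparse=$sparse_apparent dense=$dense_apparent"
echo "blocks:   sparse=$sparse_blocks dense=$dense_blocks"

# cmp -s is silent and exits 0 when both files have the same content.
if cmp -s "$workdir/sparse" "$workdir/dense"; then
    same_content=yes
else
    same_content=no
fi
echo "same content: $same_content"

rm -rf "$workdir"
```

The block counts depend on the filesystem holding the scratch directory, so they are printed here rather than assumed.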

Upvotes: 2

Rafael Aguilar

Reputation: 3279

Greetings Invinciblecache,

Googling around I've found this:

As du reports allocation space and not absolute file space, the amount of space on a file system shown by du may vary from that shown by df if files have been deleted but their blocks not yet freed. source

Not the best source, but it is a good description of what du is used for.

So, I'd rely on diff to check the content of the files, and I would recommend ignoring a size difference on the filesystem unless it is very large, which is not the case here.

Upvotes: 1

Daniel W.

Reputation: 32350

du reports disk usage, which includes space used by the filesystem itself, not only file content size.

Also check for hidden files which might not be included in your du.
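
One way to test this is to add up only the sizes of regular files, which leaves out the directory entries that du also counts and whose sizes may differ between NTFS and ext4. This is a rough sketch, assuming GNU find (for -printf) and awk; files_only_size and the demo tree are hypothetical names, not part of the question.

```shell
#!/bin/sh
# Sketch: total apparent size of regular files only (hidden files included),
# ignoring the sizes of the directories themselves.
# Assumes GNU find; files_only_size is a hypothetical helper name.
files_only_size() {
    find "$1" -type f -printf '%s\n' |
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Small demo tree so the sketch can be run on its own.
demo=$(mktemp -d)
mkdir -p "$demo/a/sub"
printf 'hello' > "$demo/a/one"          # 5 bytes
printf 'world!' > "$demo/a/sub/.two"    # 6 bytes, hidden file
demo_total=$(files_only_size "$demo/a")
echo "$demo_total"
rm -rf "$demo"
```

Running files_only_size "$src" and files_only_size "$dest" on the two trees from the question and getting the same number would suggest that the gap reported by du comes from directory metadata rather than from file content.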

Upvotes: 0
