CJD

Reputation: 195

Files compressed through a "pv" pipeline differ from ordinarily compressed files

Here is my script:

tar cf - testdir | pv -s $(du -sb testdir | awk '{print $1}') | pigz -1 > pv.tar.gz

tar cf - testdir | pigz -1 > nopv.tar.gz

diff pv.tar.gz nopv.tar.gz

The output is "Binary files pv.tar.gz and nopv.tar.gz differ".

I ran hexdump on both files and found that only the first line differs slightly:

pv.tar.gz: 8b1f 0008 9e24 5fc8 0304 bdec 5f7b c71b

nopv.tar.gz: 8b1f 0008 9c18 5fc8 0304 bdec 5f7b c71b

But after extracting both archives and comparing again, the testdir contents are exactly the same.

How can I make the two tar.gz files identical?

Upvotes: 2

Views: 510

Answers (1)

seumasmac

Reputation: 2774

It has nothing to do with pv. Bytes 5 to 8 of a gzip header (counting from 1) hold the modification timestamp, and that value will be different each time you run the command. You can tell pigz not to store it with the -m switch, so your commands become:

tar cf - testdir | pv -s $(du -sb testdir | awk '{print $1}') | pigz -1 -m > pv.tar.gz

tar cf - testdir | pigz -1 -m > nopv.tar.gz

These should give you identical files. If you hexdump them again, you'll notice that the bytes that differed before are all 00 now.
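
If you want to confirm that the timestamp is the only thing changing, here is a rough sketch that reads the field directly. The function name gzip_mtime is made up for this example. It assumes a POSIX-style shell and od, and relies on the gzip header layout, where the timestamp is stored little-endian at 0-based offsets 4 to 7.

```shell
# Hypothetical helper: print the timestamp field of a gzip file
# as a decimal Unix time.
gzip_mtime() {
    # od -An   : suppress the offset column
    #    -tu1  : print each byte as an unsigned decimal number
    #    -j4   : skip the first 4 bytes (magic, method, flags)
    #    -N4   : read exactly 4 bytes (the timestamp field)
    set -- $(od -An -tu1 -j4 -N4 "$1")
    # The field is little-endian, so the first byte is the least significant.
    echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}
```

Running gzip_mtime pv.tar.gz and gzip_mtime nopv.tar.gz on the original files should print two different values (the header bytes 24 9e c8 5f shown in the question decode to 1606983204). On files created with -m, both should print 0.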

Upvotes: 3
