Ken

Reputation: 4163

Checking an md5sum file in parallel

I have an md5sum file containing many lines, and I want to use GNU parallel to speed up the check. When md5sum -c is given no file argument, it reads the checksum lines from stdin. I tried this:

cat checksums.md5 | parallel md5sum -c {}

But I get this error:

md5sum 445350b414a8031d9dd6b1e68a6f2367 testing.gz: No such file or directory

How can I parallelize the md5sum check?

Upvotes: 5

Views: 3494

Answers (2)

Ole Tange

Reputation: 33685

Assuming checksums.md5 has the format:

d41d8cd98f00b204e9800998ecf8427e  My file name

Run:

cat checksums.md5 | parallel --pipe -N1 md5sum -c

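If your files are small, use -N100 instead of -N1 so that each md5sum job gets 100 lines and the cost of starting a process is spread over more files.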

If that does not speed up your processing, make sure your disks are fast enough: md5sum can process about 500 MB/s. Running iostat -dkx 1 will show whether your disks are the bottleneck.

Upvotes: 12

Andrey

Reputation: 2583

You need the --pipe option. In this mode, parallel splits stdin into blocks and passes each block to the command on its stdin; see man parallel for details:

cat checksums.md5 | parallel --pipe md5sum -c -

The default block size is 1 MB; it can be changed with the --block option.

Upvotes: 1
