Reputation: 19304
I have a C app that generates very large binary files, each about 30GB. After each file is written, computing its MD5 checksum takes a while (roughly a couple of minutes per file).
How would I go about computing the MD5 checksum of the file as it is being written to disk? I figure by doing this I would at least save the additional overhead of re-reading the file to compute the checksum afterwards.
I'm using the C standard library to do all file IO, and the OS is Linux.
Can this be done? Thanks!
Upvotes: 4
Views: 1189
Reputation: 992707
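This is certainly possible. Essentially, you initialise an MD5 calculation before you start writing. Then, whenever you write some data to disk, you also pass that same data to the MD5 update function. After writing all the data, you call the MD5 final function to produce the digest. Note that this relies on the file being written sequentially: if you seek back and overwrite earlier data, the running digest will no longer match what is on disk.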
If you don't have any MD5 code handy, RFC 1321 includes a reference implementation of MD5 that provides the operations described above.
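To make the init/update/final pattern concrete, here is a minimal sketch. It is not the RFC 1321 reference code; it is a compact MD5 written from the algorithm in that RFC, and the names `md5_ctx`, `md5_init`, `md5_update`, `md5_final` and `hashed_fwrite` are made up for this example. It assumes you write the file sequentially with `fwrite`.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout; a sketch, not a vetted library. */
typedef struct {
    uint32_t state[4];
    uint64_t nbytes;        /* total bytes hashed so far */
    unsigned char buf[64];  /* holds a partial 64-byte block */
} md5_ctx;

static const uint32_t K[64] = {
    0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee, 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
    0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be, 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
    0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa, 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
    0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed, 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
    0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c, 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
    0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05, 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
    0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039, 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
    0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1, 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391
};
static const unsigned S[16] = { 7, 12, 17, 22, 5, 9, 14, 20, 4, 11, 16, 23, 6, 10, 15, 21 };

/* Mix one 64-byte block into the running state. */
static void md5_block(md5_ctx *ctx, const unsigned char *p) {
    uint32_t a = ctx->state[0], b = ctx->state[1], c = ctx->state[2], d = ctx->state[3], m[16];
    for (int i = 0; i < 16; i++)  /* decode little-endian 32-bit words */
        m[i] = (uint32_t)p[4*i] | (uint32_t)p[4*i+1] << 8 |
               (uint32_t)p[4*i+2] << 16 | (uint32_t)p[4*i+3] << 24;
    for (int i = 0; i < 64; i++) {
        uint32_t f;
        int g;
        if (i < 16)      { f = (b & c) | (~b & d); g = i; }
        else if (i < 32) { f = (d & b) | (~d & c); g = (5*i + 1) % 16; }
        else if (i < 48) { f = b ^ c ^ d;          g = (3*i + 5) % 16; }
        else             { f = c ^ (b | ~d);       g = (7*i) % 16; }
        f += a + K[i] + m[g];
        unsigned s = S[(i / 16) * 4 + (i % 4)];
        a = d; d = c; c = b;
        b += (f << s) | (f >> (32 - s));  /* rotate left by s */
    }
    ctx->state[0] += a; ctx->state[1] += b; ctx->state[2] += c; ctx->state[3] += d;
}

/* Step 1: call once before the first write. */
static void md5_init(md5_ctx *ctx) {
    ctx->state[0] = 0x67452301; ctx->state[1] = 0xefcdab89;
    ctx->state[2] = 0x98badcfe; ctx->state[3] = 0x10325476;
    ctx->nbytes = 0;
}

/* Step 2: feed data in the same order it goes to disk. */
static void md5_update(md5_ctx *ctx, const void *data, size_t len) {
    const unsigned char *p = data;
    size_t used = (size_t)(ctx->nbytes % 64);
    ctx->nbytes += len;
    while (len > 0) {
        size_t n = 64 - used;
        if (n > len) n = len;
        memcpy(ctx->buf + used, p, n);
        used += n; p += n; len -= n;
        if (used == 64) { md5_block(ctx, ctx->buf); used = 0; }
    }
}

/* Step 3: pad, append the bit length, and emit the 16-byte digest. */
static void md5_final(md5_ctx *ctx, unsigned char out[16]) {
    uint64_t bits = ctx->nbytes * 8;
    size_t used = (size_t)(ctx->nbytes % 64);
    size_t padlen = (used < 56) ? 56 - used : 120 - used;
    unsigned char pad[64] = { 0x80 };  /* 0x80 followed by zeros */
    unsigned char lenb[8];
    for (int i = 0; i < 8; i++) lenb[i] = (unsigned char)(bits >> (8 * i));
    md5_update(ctx, pad, padlen);
    md5_update(ctx, lenb, 8);
    for (int i = 0; i < 16; i++)
        out[i] = (unsigned char)(ctx->state[i / 4] >> (8 * (i % 4)));
}

/* Write to the stream and hash only the bytes that were actually written. */
static size_t hashed_fwrite(const void *data, size_t len, FILE *fp, md5_ctx *ctx) {
    size_t n = fwrite(data, 1, len, fp);
    md5_update(ctx, data, n);
    return n;
}
```

In the writer you would call `md5_init` once after `fopen`, replace each `fwrite` with `hashed_fwrite`, and call `md5_final` after the last write to get the 16 digest bytes. A short return value from `hashed_fwrite` signals a write error, just as with `fwrite`. If OpenSSL is available, its `EVP_DigestInit_ex`, `EVP_DigestUpdate` and `EVP_DigestFinal_ex` functions follow the same three-step shape and are probably a better choice than hand-rolled code.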
Upvotes: 5