Reputation: 35478
I am looking for smart and efficient ways to check whether two files are exactly the same.
The program will loop through all the folders recursively, and they may include very large files.
So I decided to use incremental checks, such as an md5 hash check, to decide. This would pretty much do it already. But I wonder what other options would be sufficiently fast?
Upvotes: 0
Views: 216
Reputation: 14057
I cannot think of many other options available to you.
Remember that an md5 hash check (or any other calculation) is really only useful if you have a previously computed md5 hash (or some other calculation) and you want to be reasonably assured that the file has not changed since that calculation was last done.
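As a rough sketch of that use case, assuming Python and its standard `hashlib` module, you could hash the file in chunks so that very large files never have to fit in memory. The names `md5_of_file`, `unchanged_since` and the 1 MiB chunk size are my own choices for illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unchanged_since(path, stored_hex):
    """Compare the current digest with one computed and saved earlier."""
    return md5_of_file(path) == stored_hex
```

Note that this still reads every byte of the file, so without a stored digest it is no cheaper than reading both files and comparing them directly.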
Other things to use for reasonable assurance (using pre-existing calculations) are:
1. Inode and device (mount point) IDs from the stat() family.
2. mtime comparisons for info on when the file was last modified.
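A minimal sketch of both checks, assuming Python's `os.stat` on a POSIX-like filesystem where `st_dev` and `st_ino` are meaningful. The function names and the idea of saving a tuple as a "fingerprint" are hypothetical, just one way to arrange it.

```python
import os


def same_underlying_file(path_a, path_b):
    """True if both paths resolve to the same inode on the same device
    (for example hard links, or the same path given twice)."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def stat_fingerprint(path):
    """Cheap metadata snapshot to save now and compare against later."""
    st = os.stat(path)
    return (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns)


def probably_unchanged(path, saved_fingerprint):
    """Reasonable assurance only: metadata can match while content differs."""
    return stat_fingerprint(path) == saved_fingerprint
```

Neither check reads the file contents, which is why they are fast, and also why they only give reasonable assurance rather than proof.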
Otherwise, you are left with doing a straight byte-by-byte comparison between two files.
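For what it's worth, here is a sketch of such a comparison in Python, assuming regular files on disk. It checks the sizes first, since files of different sizes cannot be identical, and then stops at the first differing chunk. The name `files_identical` and the chunk size are my own choices.

```python
import os


def files_identical(path_a, path_b, chunk_size=1024 * 1024):
    """Compare two regular files chunk by chunk, stopping at the first difference."""
    # Different sizes mean different contents, so no reading is needed.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:
                # Both files reached end of file with every chunk equal.
                return True
```

Python's standard library also offers `filecmp.cmp(path_a, path_b, shallow=False)`, which performs a content comparison along the same lines.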
Upvotes: 1