Santhanakumar

Reputation: 382

md5sum different values for the same content

I want to compare two files to check whether the second file has been modified relative to the first.

My plan was to compare the md5_file() results of both files. The problem is that the original file was created with Unix line endings, while the second file may use any line-ending style (Unix, Mac or Windows), so the comparison always fails. How can I solve this?

I tried removing the whitespace from both files before comparing them, but that also failed. Is there another way to solve this?

I'm not supposed to copy or change the second file.

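I fixed it myself as follows:

// Strip all whitespace (including line endings) before hashing.
$hash1 = md5(preg_replace('/\s/', '', file_get_contents($file1)));
$hash2 = md5(preg_replace('/\s/', '', file_get_contents($file2)));

// Contents match (ignoring whitespace), so skip to the next iteration.
if ($hash1 === $hash2)
    continue;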

Upvotes: 0

Views: 604

Answers (2)

dognose

Reputation: 20899

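Depending on how big the files are, you could read both into strings, convert them to a common character encoding, and then compare the md5 hashes of those strings.

  $file1 = file_get_contents($file_url_1);
  $file2 = file_get_contents($file_url_2);

  // "whateverEncoding" and "whateverOtherEncoding" are placeholders for each file's actual encoding.
  $file1 = mb_convert_encoding($file1, "UTF-8", "whateverEncoding");
  $file2 = mb_convert_encoding($file2, "UTF-8", "whateverOtherEncoding");

  // Strict comparison avoids PHP's loose numeric-string matching on hashes.
  if (md5($file1) === md5($file2))

  ....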

Upvotes: 1

John V.

Reputation: 4670

Simply replace all of the line endings in the second file with Unix-style ones, but do it on a temporary copy (or similar) so that the original file is preserved.
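One way to sketch this without writing a temporary file is to normalize the line endings in memory and hash the result. The helper name normalizedMd5() below is made up for illustration, and the sketch assumes both files are small enough to read into memory and that only the line endings differ (not the character encoding).

```php
<?php
// Hypothetical helper: returns the md5 of a file's contents after
// converting Windows (\r\n) and classic Mac (\r) line endings to Unix (\n).
// The file on disk is only read, never modified.
function normalizedMd5(string $path): string
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Replace \r\n first so a Windows ending does not turn into two newlines.
    $normalized = str_replace(["\r\n", "\r"], "\n", $contents);

    return md5($normalized);
}
```

Two files would then count as unchanged when normalizedMd5($file1) === normalizedMd5($file2). Unlike stripping every whitespace character, this still detects changes to spaces and tabs inside a line.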

Upvotes: 1
