Manjinder Aulakh

Reputation: 208

Synchronizing Files

I have a project in which a file has to be synchronized across computers.

My problem is that my program fails with an error saying it exceeded the maximum execution time of 30 seconds.

I have written a PHP program for this. It divides the old file into fixed-length blocks and computes an MD5 hash of each block. It then walks the modified file from start to end, hashing a window of the same length at every offset and comparing that hash with the old block hashes. This way it finds the blocks that do not need to be transferred.

If anyone out there has any experience, advice, links or code, you're more than welcome. Thanks.

P.S. I have the luxury of working in PHP, Java or C++.

The code below is for testing purposes. It takes two files from the same location (the original and the modified one), hashes the blocks of the old file, and compares them with hashes taken from the new file at every offset. Hope this helps:

<html>
<body>
<?php
   // Test harness: both files live in the same directory.
   $old_file = file_get_contents('11.jpg');   // original file
   $new_file = file_get_contents('12.jpg');   // modified file
   $block_length = 2048;
   $old_length = strlen($old_file);
   $new_length = strlen($new_file);

   // Split the old file into fixed-size blocks and hash each one.
   // Using "<" avoids hashing an empty block when the length is an
   // exact multiple of $block_length.
   $md5_hashes_old = array();
   for ($j = 0; $j < $old_length; $j += $block_length) {
      $md5_hashes_old[] = md5(substr($old_file, $j, $block_length));
   }
   $no_of_blocks = count($md5_hashes_old);
   echo $no_of_blocks.'<br>';

   // For each old block, hash a window at every offset of the new file
   // until one matches. This computes up to (blocks x file length) MD5
   // hashes, which is why the script hits the 30 second limit.
   $matched_blocks = array();
   for ($i = 0; $i < $no_of_blocks; $i++) {
      for ($j = 0; $j < $new_length; $j++) {
         $md5_hash = md5(substr($new_file, $j, $block_length));
         // Strict comparison: "==" can treat two different hashes of
         // the form "0e..." as equal numbers.
         if ($md5_hashes_old[$i] === $md5_hash) {
            $matched_blocks[] = array('block_no' => $i, 'index' => $j);
            break;
         }
      }
   }
   print_r($matched_blocks);
?>

</body>
</html>

Upvotes: 1

Views: 126

Answers (2)

Phil Hannent

Reputation: 12317

Increasing the timeout is your first port of call.

I assume you are only doing the md5 comparison when you have a more recent modified date and the file length is different.
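That pre-check could look roughly like the sketch below. The helper name `needs_sync` and the idea that both copies are reachable as local paths are assumptions for illustration. It also treats a newer timestamp or a different length as enough to trigger the comparison, since an edit can leave the length unchanged.

```php
<?php
// Hypothetical helper (not from the question): returns true when the pair
// looks changed and is worth the expensive block comparison.
// Assumption: both copies are reachable as local paths.
function needs_sync(string $old_path, string $new_path): bool
{
    // PHP caches stat() results within a request, so drop the cache first.
    clearstatcache();

    // Newer modification time OR different size means "probably changed".
    return filemtime($new_path) > filemtime($old_path)
        || filesize($new_path) !== filesize($old_path);
}
```

With the file names from the question this would be called as `needs_sync('11.jpg', '12.jpg')` before any hashing starts.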

If you were using C++ you could use file system watchers to be notified when files are modified and then use that to trigger your process or to trigger the creation of the hash.

Another trick would be to cache files for making a binary diff:

http://dev.chromium.org/developers/design-documents/software-updates-courgette

Upvotes: 1

oscarmlage

Reputation: 1455

You can always apply the dirty trick:

<?php  set_time_limit(9999);  ?>

But I agree with @aioobe: this sounds like a reinvented rsync.
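For reference, the core trick rsync uses is a cheap rolling checksum that is updated in constant time as the window slides by one byte, so MD5 is only computed when the cheap checksum already matches. Below is a minimal sketch of that idea, not rsync's actual implementation. The names `weak_sum` and `find_blocks` are made up for this example, the checksum is a simple Adler-style pair of 16-bit sums, PHP 7.1+ is assumed, and a short tail block of the old file is simply ignored.

```php
<?php
// Sketch of rsync-style matching (an illustration, not rsync's real code).
const MOD = 65536;

// Weak checksum of a whole block, returned as the two 16-bit halves [a, b].
function weak_sum(string $block): array
{
    $a = 0;
    $b = 0;
    $len = strlen($block);
    for ($i = 0; $i < $len; $i++) {
        $byte = ord($block[$i]);
        $a = ($a + $byte) % MOD;
        $b = ($b + ($len - $i) * $byte) % MOD;
    }
    return [$a, $b];
}

// Returns [block_no => offset in $new] for each full old block found in $new.
function find_blocks(string $old, string $new, int $block_length): array
{
    // Index the old blocks by weak checksum; keep the MD5 for confirmation.
    $index = [];
    $old_len = strlen($old);
    for ($j = 0, $k = 0; $j + $block_length <= $old_len; $j += $block_length, $k++) {
        $block = substr($old, $j, $block_length);
        [$a, $b] = weak_sum($block);
        $index[($b << 16) | $a][] = [$k, md5($block)];
    }

    $matches = [];
    $last = strlen($new) - $block_length;   // last valid window offset
    if ($last < 0) {
        return $matches;
    }

    [$a, $b] = weak_sum(substr($new, 0, $block_length));
    for ($j = 0; ; $j++) {
        $key = ($b << 16) | $a;
        if (isset($index[$key])) {
            // Weak hit: only now pay for the strong hash.
            $md5 = md5(substr($new, $j, $block_length));
            foreach ($index[$key] as [$k, $hash]) {
                if ($hash === $md5 && !isset($matches[$k])) {
                    $matches[$k] = $j;
                }
            }
        }
        if ($j >= $last) {
            break;
        }
        // Roll the window one byte: drop $new[$j], add $new[$j + $block_length].
        $out = ord($new[$j]);
        $in = ord($new[$j + $block_length]);
        $a = ($a - $out + $in + MOD) % MOD;
        $b = (($b - $block_length * $out + $a) % MOD + MOD) % MOD;
    }
    return $matches;
}
```

For example, `find_blocks('AAAABBBBCCCC', 'xxBBBByAAAA', 4)` reports block 0 at offset 7 and block 1 at offset 2, while block 2 is not found. The new file is scanned once, instead of once per old block.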

Upvotes: 0
