Reputation: 208
I have a project in which a file has to be synchronized across computers.
My problem is that my program fails with an error saying it exceeded the maximum execution time of 30 seconds.
I have written a PHP program for this. It divides the old file into blocks, computes an MD5 hash of each block, and compares those hashes against the modified file by hashing a block of the same length at every offset, from the start to the end of the modified file. This way it finds the blocks that do not need to be transferred.
If anyone has experience, advice, links, or code, you are more than welcome. Thanks.
P.S. I have the luxury of working in PHP, Java, or C++.
The code below is for testing purposes. It takes two files from the same location (the modified file and the original), hashes the blocks of the old file, and compares them with hashes computed from the new file at every offset. Hope this helps:
<html>
<body>
<?php
// Load both versions of the file into memory.
$old_file = file_get_contents('11.jpg');
$new_file = file_get_contents('12.jpg');
$block_length = 2048;
$old_length = strlen($old_file);
$md5_hashes_old = array();
$first_char = array();
$k = 0;
// Split the old file into fixed-size blocks and hash each one.
// Looping while $j < $old_length avoids hashing an empty block at the end.
for ($j = 0; $j < $old_length; $j += $block_length) {
    $block = substr($old_file, $j, $block_length);
    $md5_hashes_old[$k] = md5($block);
    $first_char[$k] = $block[0];
    $k++;
}
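$no_of_blocks = sizeof($md5_hashes_old);
echo $no_of_blocks;
$new_length = strlen($new_file);
$matched_blocks = array();
if (isset($md5_hashes_old[1])) {
    echo $md5_hashes_old[1].'<br>';
}
// For each old block, slide over the new file one byte at a time and hash
// the window at every offset. This is the slow part: it can compute up to
// (number of blocks) x (length of new file) MD5 hashes.
for ($i = 0; $i < $no_of_blocks; $i++) {
    for ($j = 0; $j < $new_length; $j++) {
        $block = substr($new_file, $j, $block_length);
        $md5_hash = md5($block);
        // Use === so that hex digests such as "0e12..." are not compared as numbers.
        if ($md5_hashes_old[$i] === $md5_hash) {
            // Record which old block was found and at which offset.
            $matched_blocks[] = array('block_no' => $i, 'index' => $j);
            break;
        }
        echo 'old='.$md5_hashes_old[$i].' i='.$i.' new='.$md5_hash.'<br>';
    }
}
print_r($matched_blocks);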
?>
</body>
</html>
Upvotes: 1
Views: 126
Reputation: 12317
Increasing the timeout is your first port of call.
I assume you are only doing the md5 comparison when you have a more recent modified date and the file length is different.
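A minimal sketch of that pre-check, in case it helps. The function name `needs_block_comparison` and the two path parameters are made up for illustration; it only assumes that a file with the same size and no newer modification time has not changed, which is a heuristic rather than a guarantee.

```php
<?php
// Hypothetical helper (not from the original post): returns true when cheap
// metadata suggests the file may have changed, so the expensive block
// comparison is worth running.
function needs_block_comparison(string $old_path, string $new_path): bool
{
    // Drop PHP's cached stat results so we read the current size and mtime.
    clearstatcache();

    $size_changed = filesize($old_path) !== filesize($new_path);
    $is_newer = filemtime($new_path) > filemtime($old_path);

    // Assumption: same size and no newer timestamp means "unchanged".
    return $size_changed || $is_newer;
}
```

You would call it before the hashing loop and skip the whole comparison when it returns false.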
If you were using C++ you could use file system watchers to be notified when files are modified and then use that to trigger your process or to trigger the creation of the hash.
Another trick would be to cache files for making a binary diff:
http://dev.chromium.org/developers/design-documents/software-updates-courgette
Upvotes: 1
Reputation: 1455
You can always apply the dirty trick:
<?php set_time_limit(9999); ?>
But I agree with @aioobe: this sounds like a reinvented rsync.
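For what it's worth, the reason rsync copes with this is that it does not compute MD5 at every offset. It slides a cheap rolling checksum over the new file and only computes the strong hash where the cheap one matches. Below is a rough sketch of that idea, not rsync's actual code. The function names `weak_sums` and `find_matching_blocks` are invented here, it ignores a trailing partial block in the old file, and unlike the question's code it records every matching offset rather than stopping at the first.

```php
<?php
// Weak checksum of one block, as two 16-bit sums [a, b]:
// a = sum of bytes, b = sum of bytes weighted by distance from the block end.
function weak_sums(string $block): array
{
    $a = 0;
    $b = 0;
    $len = strlen($block);
    for ($i = 0; $i < $len; $i++) {
        $byte = ord($block[$i]);
        $a = ($a + $byte) & 0xFFFF;
        $b = ($b + ($len - $i) * $byte) & 0xFFFF;
    }
    return [$a, $b];
}

// Returns a list of ['block_no' => ..., 'index' => ...] entries, one for each
// offset in $new where a full-size block of $old occurs.
function find_matching_blocks(string $old, string $new, int $block_length): array
{
    // Index every full-size old block by its weak checksum.
    $index = [];
    $old_length = strlen($old);
    for ($pos = 0, $k = 0; $pos + $block_length <= $old_length; $pos += $block_length, $k++) {
        $block = substr($old, $pos, $block_length);
        [$a, $b] = weak_sums($block);
        $index[($b << 16) | $a][] = ['block_no' => $k, 'md5' => md5($block)];
    }

    $matches = [];
    $new_length = strlen($new);
    if ($new_length < $block_length) {
        return $matches;
    }

    // Checksum of the first window, then roll it forward one byte at a time.
    [$a, $b] = weak_sums(substr($new, 0, $block_length));
    for ($j = 0; ; $j++) {
        $key = ($b << 16) | $a;
        if (isset($index[$key])) {
            // Weak match: confirm with MD5 before trusting it.
            $md5 = md5(substr($new, $j, $block_length));
            foreach ($index[$key] as $candidate) {
                if ($candidate['md5'] === $md5) {
                    $matches[] = ['block_no' => $candidate['block_no'], 'index' => $j];
                }
            }
        }
        if ($j + $block_length >= $new_length) {
            break; // this was the last full window
        }
        // Roll: remove the byte leaving the window, add the byte entering it.
        $out = ord($new[$j]);
        $in = ord($new[$j + $block_length]);
        $a = ($a - $out + $in) & 0xFFFF;
        $b = ($b - $block_length * $out + $a) & 0xFFFF;
    }
    return $matches;
}
```

Each step of the inner loop is a handful of integer operations instead of a 2048-byte MD5, and the old blocks are looked up in an array instead of being scanned one by one, so the work is roughly proportional to the size of the new file.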
Upvotes: 0