Reputation: 18113
OK, I have this:
a:1:{i:0;a:3:{s:7:"address";s:52:"Elågåresgude 41, 2200 Københamm N";s:12:"company_name";s:14:"Kaffe og Kluns";s:9:"telephone";s:0:"";}}
This does not work with unserialize($string);
I know where the error is: it's the number in front of the address. It should not be 52, but 36.
I got to this number by counting the characters in the string (which gave me 33) and then adding 1 for each å or ø that exists in the string.
When I replace 52 with 36, it unserializes just fine.
Now I would like to write a script to do this for all my addresses.
But how can I even do this? How do I extract the address/company_name/telephone strings when the data is "corrupted"?
Upvotes: 1
Views: 2143
Reputation: 48031
This problem is a classic case of someone taking a shortcut when updating a value in a serialized string. The lesson, swiftly learned, is to avoid this headache by unserializing your data, modifying your value(s), then re-serializing it.
I feel regular expressions afford a more direct approach for parsing the corrupted serialized string. To be perfectly clear, my snippet will only update the byte counts; if you have a serialized string that is corrupted by some other means, this will not be the remedy.
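As a small illustration of why the count has to be 36 rather than 33: serialize() records the byte length of a string, and in UTF-8 each å or ø occupies two bytes. The sketch below assumes the address is UTF-8 encoded; the variable names are mine, and the characters are written as \u{...} escapes (PHP 7+) so the result does not depend on how the file itself is saved.

```php
<?php
// Assumption: the address is UTF-8; \u{e5} is å and \u{f8} is ø (PHP 7+ escapes).
$address = "El\u{e5}g\u{e5}resgude 41, 2200 K\u{f8}benhamm N";

$byte_count = strlen($address);                   // strlen() counts bytes
$char_count = preg_match_all('/./su', $address);  // the /u flag counts UTF-8 characters

echo "bytes: $byte_count, characters: $char_count\n";
echo serialize($address), "\n";                   // the s:N prefix uses the byte count
```

This is why the repair only needs strlen(): the number in front of each string has to match what strlen() reports for the value.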
Here is a simple preg_replace_callback() call that only captures the value substring and unconditionally replaces all byte counts in the serialized string:
Code:
$corrupted_byte_counts = <<<STRING
a:1:{i:0;a:3:{s:7:"address";s:52:"Elågåresgude 41, 2200 Københamm N";s:12:"company_name";s:14:"Kaffe og Kluns";s:9:"telephone";s:0:"";}}
STRING;
$repaired = preg_replace_callback(
    '/s:\d+:"(.*?)";/s',
    function ($m) {
        return 's:' . strlen($m[1]) . ":\"{$m[1]}\";";
    },
    $corrupted_byte_counts
);
echo "corrupted serialized array:\n$corrupted_byte_counts";
echo "\n---\n";
echo "repaired serialized array:\n$repaired";
echo "\n---\n";
print_r(unserialize($repaired));
Output:
corrupted serialized array:
a:1:{i:0;a:3:{s:7:"address";s:52:"Elågåresgude 41, 2200 Københamm N";s:12:"company_name";s:14:"Kaffe og Kluns";s:9:"telephone";s:0:"";}}
---
repaired serialized array:
a:1:{i:0;a:3:{s:7:"address";s:36:"Elågåresgude 41, 2200 Københamm N";s:12:"company_name";s:14:"Kaffe og Kluns";s:9:"telephone";s:0:"";}}
---
Array
(
    [0] => Array
        (
            [address] => Elågåresgude 41, 2200 Københamm N
            [company_name] => Kaffe og Kluns
            [telephone] =>
        )

)
I've even gone a bit further elsewhere to address a possible fringe case. Without implementing that pattern extension, the above snippet only breaks when a string to be matched contains ";
In that case, the extended pattern attempts to address that possibility.
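To illustrate the fringe case, here is a minimal sketch of one way to tolerate "; inside a value. It is not the linked pattern, which is not reproduced here. The helper name recount_serialized_strings() and the lookahead are my own assumptions: a closing "; is only accepted when it is followed by something that can start the next serialized token, a closing brace, or the end of the input.

```php
<?php
// Hypothetical helper (not the linked pattern): recount every s:N:"..." length,
// but only treat "; as the end of a value when the next characters look like the
// start of another serialized token (a:, O:, C:, s:, i:, d:, b:, N;), a }, or the end.
function recount_serialized_strings(string $serialized)
{
    return preg_replace_callback(
        '/s:\d+:"(.*?)";(?=[aOCsidb]:|N;|\}|$)/s',
        function (array $m) {
            // strlen() returns the byte length, which is what unserialize() expects.
            return 's:' . strlen($m[1]) . ':"' . $m[1] . '";';
        },
        $serialized
    );
}

// A value that contains "; in the middle, with a deliberately wrong length (99).
$corrupted = 'a:1:{s:4:"note";s:99:"say "hi"; then K' . "\u{f8}" . 'benhavn";}';
$repaired  = recount_serialized_strings($corrupted);

var_dump(unserialize($repaired));
```

This still guesses wrong if a value contains something like ";s: or a fake s:5:" prefix, so treat it as a best-effort repair and not a parser.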
Upvotes: 0
Reputation: 15159
Looks like a bug in the function when dealing with multi-byte characters. You might also want to try explicitly encoding the string as UTF-8 before serializing it.
As a workaround, you could base64 encode the address before serializing it, then base64 decode it when you unserialize it.
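A minimal sketch of that base64 workaround, assuming a record shaped like the one in the question (the variable names are illustrative). Base64 output is plain ASCII, so the stored string contains no multi-byte characters at all. Note that this only protects data written from now on; it does not repair strings that are already corrupted.

```php
<?php
// Assumption: a record shaped like the one in the question; names are illustrative.
$record = [
    'address'      => "El\u{e5}g\u{e5}resgude 41, 2200 K\u{f8}benhamm N",
    'company_name' => 'Kaffe og Kluns',
    'telephone'    => '',
];

// Encode every value to plain ASCII before serializing...
$stored = serialize(array_map('base64_encode', $record));

// ...and decode every value again after unserializing.
$restored = array_map('base64_decode', unserialize($stored));

var_dump($restored === $record);
```

The trade-off is that the stored values are no longer human-readable or searchable in the database.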
Upvotes: 0
Reputation: 16943
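// Recalculates every s:N:"..." length from the actual byte length of the value.
// Caveat: values that themselves contain a double quote are not handled.
function fix_corrupted_serialized_string($string) {
    // Each piece after the first starts with a string value;
    // the piece before it ends with that value's (possibly wrong) length.
    $tmp = explode(':"', $string);
    $length = count($tmp);
    for ($i = 1; $i < $length; $i++) {
        // The value is everything up to the next double quote.
        list($value) = explode('"', $tmp[$i]);
        $str_length = strlen($value);
        // Overwrite the last colon-separated field of the previous piece (the old length).
        $tmp2 = explode(':', $tmp[$i-1]);
        $last = count($tmp2) - 1;
        $tmp2[$last] = $str_length;
        $tmp[$i-1] = join(':', $tmp2);
    }
    return join(':"', $tmp);
}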
working demo: http://codepad.viper-7.com/GNbM25
Upvotes: 4
Reputation: 1786
I think one solution would be to test whether unserialize worked. If not, delete the value and reserialize it.
$yourserializestring = '...';
$data = @unserialize($yourserializestring);
// unserialize() returns false on failure, but also for a serialized false ('b:0;')
if ($data === false && $yourserializestring !== 'b:0;') {
    // Something didn't work, you should recreate it
} else {
    echo "ok";
}
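The same check can be wrapped in a small function so it is easy to run over many rows. This is only a sketch: can_unserialize() is a hypothetical name, and passing allowed_classes => false (available since PHP 7) is my own addition to avoid instantiating objects from data you do not trust.

```php
<?php
// Hypothetical helper: returns true when $serialized can be unserialized.
// 'b:0;' is special-cased because it legitimately unserializes to false.
function can_unserialize(string $serialized): bool
{
    if ($serialized === 'b:0;') {
        return true;
    }
    // @ suppresses the notice/warning that unserialize() raises on malformed input.
    return @unserialize($serialized, ['allowed_classes' => false]) !== false;
}

$good = 'a:1:{s:7:"address";s:2:"ok";}';
$bad  = 'a:1:{s:7:"address";s:52:"ok";}'; // wrong length, as in the question

var_dump(can_unserialize($good));
var_dump(can_unserialize($bad));
```

Rows for which this returns false are the ones that need to be recreated or repaired.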
Upvotes: -1