Reputation: 23907
I'm trying to troubleshoot an issue with some (apparently) mangled serialized data in a MySQL database, after a conversion to UTF-8. When I try to unserialize them, I get the usual:
Notice: unserialize() [function.unserialize]: Error at offset 1481 of 255200 bytes [...]
However, given that this is a multi-byte string, I can't figure out how to find which character is at that byte offset. What I need is something like substr() but for bytes, instead of characters. How can I do that?
Thanks in advance.
Upvotes: 4
Views: 2013
Reputation: 3541
Try substr($str, 1481, 2); substr($str, 1481, 3); or substr($str, 1481, 4);. If the data is UTF-8, you'll find the character in one of those three substrings, because a multi-byte UTF-8 character takes 2 to 4 bytes (plain ASCII takes 1), depending on its first byte.
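Rather than trying each length by hand, the length can be read from the lead byte. The following is only a minimal sketch, assuming PHP 7 or later, a byte-based substr(), and valid UTF-8 around the offset; the helper name utf8CharAtByteOffset is made up for illustration and is not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): return the complete UTF-8
// character that contains the byte at $offset.
// Assumes valid UTF-8 around $offset and a byte-based substr().
function utf8CharAtByteOffset(string $str, int $offset): string
{
    if ($offset < 0 || $offset >= strlen($str)) {
        return ''; // offset outside the string
    }

    // Continuation bytes look like 10xxxxxx; step back to the lead byte.
    $start = $offset;
    while ($start > 0 && (ord($str[$start]) & 0xC0) === 0x80) {
        $start--;
    }

    $lead = ord($str[$start]);
    if ($lead < 0x80) {
        $length = 1; // 0xxxxxxx: ASCII
    } elseif (($lead & 0xE0) === 0xC0) {
        $length = 2; // 110xxxxx
    } elseif (($lead & 0xF0) === 0xE0) {
        $length = 3; // 1110xxxx
    } elseif (($lead & 0xF8) === 0xF0) {
        $length = 4; // 11110xxx
    } else {
        $length = 1; // not a valid lead byte; return it alone
    }

    return substr($str, $start, $length);
}

// 'a', the 3-byte euro sign (E2 82 AC), 'b'
$sample = "a\u{20AC}b";
echo bin2hex(utf8CharAtByteOffset($sample, 2)), "\n"; // prints "e282ac"
```

Printing the result through bin2hex() avoids depending on how the terminal or browser renders the bytes.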
I've had a lot of problems with this, so if you can't work out what's going on with the encoding, reply again :-) I'll try to lend you a hand.
Good luck!
Edit: Don't forget to send header("Content-Type: text/html; charset=utf-8"); so that the result displays properly.
Upvotes: 3
Reputation: 655775
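substr() does work on bytes rather than characters (unless the mbstring.func_overload setting has been enabled to overload it). Offsets are zero-based, so this should return the byte at offset 1481: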
substr($data, 1481, 1)
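A single byte is often hard to interpret on its own, so it can help to look at its neighbours as hex. This is a rough sketch only, assuming PHP 7 or later; dumpAroundOffset and its $radius parameter are hypothetical names introduced here, not an existing API.

```php
<?php
// Hypothetical helper (not a PHP built-in): show the bytes surrounding a
// byte offset as space-separated hex pairs.
function dumpAroundOffset(string $data, int $offset, int $radius = 8): string
{
    // Clamp the window start so it never goes below the first byte.
    $start = max(0, $offset - $radius);
    // Take the bytes from $start up to $radius bytes past $offset.
    $chunk = substr($data, $start, $offset - $start + $radius + 1);
    $hex   = implode(' ', str_split(bin2hex($chunk), 2));

    return sprintf('offset %d..%d: %s', $start, $start + strlen($chunk) - 1, $hex);
}

// 'a', the 3-byte euro sign (E2 82 AC), 'b'
$data = "a\u{20AC}b";
echo dumpAroundOffset($data, 2, 1), "\n"; // prints "offset 1..3: e2 82 ac"
```

For the error in the question, a call such as dumpAroundOffset($data, 1481) would show the bytes on either side of the reported offset.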
Upvotes: 0