Reputation: 99
I have a problem with preg_split and UTF-8. This is the code:
$original['words'] = preg_split("/[\s]+/", $original['text']);
print_r($original);
This is the output:
Array
(
[text] => Šios baterijos kaista
[words] => Array
(
[0] => �
[1] => ios
[2] => baterijos
[3] => kaista
This code is running in the CakePHP framework. Note that [text] is displayed correctly, but the first word gets mangled during the split. By the way, I tried these settings:
mb_internal_encoding( 'UTF-8');
mb_regex_encoding( 'UTF-8');
ini_set('default_charset','utf-8');
None helped. Thank you.
Upvotes: 5
Views: 4567
Reputation: 1170
$original = mb_split("[\s]+", 'Šios baterijos kaista');
print_r($original);
Result:
Array
(
[0] => Šios
[1] => baterijos
[2] => kaista
)
Note:
1) Don't forget to remove the leading and trailing '/' from the regex pattern when using mb_split.
2) Only works if the mbstring extension is enabled.
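3) If you would rather keep preg_split, a likely explanation for the broken output is that "Š" is stored in UTF-8 as the two bytes 0xC5 0xA0, and without the u modifier PCRE works byte by byte, so in some locales \s matches the lone 0xA0 byte (a non-breaking space in Latin-1) and cuts the character in half. A minimal sketch, assuming the string really is UTF-8 (the variable names and the PREG_SPLIT_NO_EMPTY flag are my own additions, not from the question):

```php
<?php
// Assumption: $text is valid UTF-8, as in the question's sample.
$text = 'Šios baterijos kaista';

// The "u" modifier tells PCRE to treat pattern and subject as UTF-8,
// so \s can no longer match a single byte inside a multibyte character.
// PREG_SPLIT_NO_EMPTY drops empty pieces caused by leading/trailing whitespace.
$words = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
```

This keeps the usual '/' delimiters and does not need the mbstring extension. Note that with the u modifier preg_split returns false if the input is not valid UTF-8, so check the return value.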
Upvotes: 0