Reputation: 714
I need to split a Unicode string into an array of 70-character chunks, so that each value in the result array is a string 70 characters long. This is my code:
$msg = preg_replace('/[\r\n]+/', ' ', $smsContent);
$chunks = wordwrap($msg, 70, '\n');
$chunks = explode('\n', $chunks);
//print_r($chunks);
But the result array contains values of different lengths.
Here is an example:
$smsContent = "सभी मनुष्यों कोगौरव और अधिकारों के मामले में जनजात स्वतंत्रता और समानता प्राप्त है | उन्हें बुद्धि और अन्तरात्मा कि देन प्राप्त है |";
Result:
Array
(
[0] => सभी मनुष्यों कोगौरव और अधि
[1] => कारों के मामले में जनजात स�
[2] => �वतंत्रता और समानता प्राप्
[3] => त है | उन्हें बुद्धि और अन्त
[4] => रात्मा कि देन प्राप्त है |
)
I need to split it into values that are 70 characters long, but the result is not correct. I also need to avoid splitting words.
Upvotes: 0
Views: 186
Reputation: 89584
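You can't use an approach based on the number of bytes, because your string contains multibyte characters and possibly combining characters. You have to work by character (glyph) instead. It's possible to do that using the character classes [:graph:] and [:print:] together with the u modifier. The upper bound of the quantifier is one less than the maximum chunk length, so use 69 instead of 30 for 70-character chunks: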
preg_match_all('~[[:graph:]][[:print:]]{0,30}(?!\S)~u', $smsContent, $m);
print_r($m[0]);
You can also try the grapheme functions from the intl extension, such as grapheme_strlen() and grapheme_substr().
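As a rough illustration of counting by grapheme rather than by byte, here is a minimal sketch of a greedy word wrap. The helper name wrapByGraphemes is hypothetical, and the sketch counts grapheme clusters with PCRE's \X instead of the intl functions so that it runs without extra extensions. It assumes valid UTF-8 input, and the way Devanagari conjuncts are segmented may vary with the PCRE version.

```php
<?php
// Hypothetical helper (not from the original answer): greedy word wrap that
// measures line length in grapheme clusters instead of bytes.
function wrapByGraphemes(string $text, int $width): array
{
    // Split on any run of whitespace; this also removes \r and \n.
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);

    $chunks = [];
    $current = '';
    $currentLen = 0;

    foreach ($words as $word) {
        // \X matches one extended grapheme cluster, so the number of matches
        // is the word length as PCRE segments it.
        $wordLen = preg_match_all('/\X/u', $word);

        if ($current === '') {
            // First word of a chunk; kept whole even if longer than $width.
            $current = $word;
            $currentLen = $wordLen;
        } elseif ($currentLen + 1 + $wordLen <= $width) {
            // The word still fits, including the separating space.
            $current .= ' ' . $word;
            $currentLen += 1 + $wordLen;
        } else {
            // The word does not fit: close this chunk and start a new one.
            $chunks[] = $current;
            $current = $word;
            $currentLen = $wordLen;
        }
    }

    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}

// Sample input for illustration only.
$smsContent = "सभी मनुष्यों को गौरव और अधिकारों के मामले में जन्मजात स्वतंत्रता प्राप्त है";
print_r(wrapByGraphemes($smsContent, 70));
```

Words are never split, so a single word longer than the limit ends up in a chunk of its own that exceeds it. If the intl extension is available, grapheme_strlen($word) could probably replace the preg_match_all() count.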
Upvotes: 1
Reputation: 1267
You have to use the str_split() function:
$smsContent = "सभी मनुष्यों कोगौरव और अधिकारों के मामले में जनजात स्वतंत्रता और समानता प्राप्त है | उन्हें बुद्धि और अन्तरात्मा कि देन प्राप्त है |";
$output = str_split($smsContent, 70);
print_r($output);
Upvotes: -1