Anonymous

Reputation: 12027

PHP is Counting Some Characters as 3 Characters

I am trying to insert text between special characters, but PHP counts some special characters as 3 characters, and I don't understand why. For example, calling strlen() on any of the following symbols returns 3:

➊➜❚✶➪

I therefore need a way to insert something between every character of a string, treating each special character as a single character. For example, if the string were:

TE➊➜❚S✶T➪

The ideal output would be:

|T|E|➊|➜|❚|S|✶|T|➪|

I have tried using this:

<?php
$string = 'TE➊➜❚S✶T➪';
$newstring = ''; // initialize before appending to avoid an undefined-variable notice
$array = str_split($string);
foreach ($array as $char) {
    $newstring .= '|'.$char;
}
$newstring .= '|';
echo $newstring;
?>

However, since PHP interprets each special character as 3 characters, it inserts the delimiter between every single character of the three-character symbol, which causes the output to appear like this:

|T|E|â|ž|Š|â|ž|œ|â||š|S|â|œ|¶|T|â|ž|ª|

Therefore it is changing the symbols like this:

➊ => âžŠ
➜ => âžœ
❚ => âš
✶ => âœ¶
➪ => âžª

Each of those single characters then becomes its own element of the array.

Question: Is there any way to count such symbols as one character when splitting a string per character in order to insert something in between?

What I have tried:

  1. Encoding in UTF-8
  2. Encoding in UTF-8 without BOM
  3. Using htmlspecialchars()
  4. Using htmlspecialchars_decode()
  5. Using htmlentities()
  6. Using html_entity_decode()

None of these made any difference.
Is there any way to do this? Thanks.

Upvotes: 1

Views: 254

Answers (3)

Rob

Reputation: 224

One thing that is missing is joining the array back into the string you want. You can make this change to get your desired string:

$array = preg_split('//u', $string);
print_r($array);
// The empty first and last elements of $array produce the outer '|' delimiters.
$newstring = implode('|', $array);

Upvotes: 0

Sharanya Dutta

Reputation: 4021

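The function str_split splits a string byte by byte, so it only works correctly with single-byte encodings. Each of your symbols takes three bytes in UTF-8, which is why it is broken into three pieces. To split a multibyte string, use preg_split with the u modifier.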

Replace

$array = str_split($string);

with

$array = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
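
Put together, the whole thing might look like the sketch below. It assumes the source file and the input string are UTF-8 and that PHP 7 or later is available (for the type declarations). The helper name delimit_chars is made up for illustration and is not a PHP built-in.

```php
<?php
// Hypothetical helper (not part of PHP): surrounds every UTF-8 character
// of $string with $delimiter.
function delimit_chars(string $string, string $delimiter = '|'): string
{
    // The u modifier makes PCRE treat the subject as UTF-8, so the empty
    // pattern matches between characters rather than between bytes.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false on failure, e.g. for invalid UTF-8 input.
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    // PREG_SPLIT_NO_EMPTY dropped the empty edge elements, so the outer
    // delimiters are added by hand.
    return $delimiter . implode($delimiter, $chars) . $delimiter;
}

echo delimit_chars('TE➊➜❚S✶T➪'), "\n"; // |T|E|➊|➜|❚|S|✶|T|➪|
```

Note that an empty input string yields two delimiters with nothing between them; whether that is the desired result depends on your use case.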

Upvotes: 2

Wrikken

Reputation: 70540

Use the mbstring functions and tell them you are using UTF-8. Also, htmlspecialchars() and the like have a charset argument: if you're not using ISO-8859-1 and your PHP version is lower than 5.4, you MUST set it to the correct one.
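
As a rough sketch of the mbstring route, assuming the mbstring extension is loaded and the input is UTF-8: mb_str_split() only exists from PHP 7.4, so the helper below (utf8_chars, a name invented for this example) falls back to an mb_substr() loop on older versions.

```php
<?php
// Hypothetical helper (not part of PHP): returns the characters of a
// UTF-8 string as an array.
function utf8_chars($string)
{
    if (function_exists('mb_str_split')) {
        // Available from PHP 7.4.
        return mb_str_split($string, 1, 'UTF-8');
    }

    // Fallback for older versions: pick out one character at a time.
    $chars = array();
    $length = mb_strlen($string, 'UTF-8');
    for ($i = 0; $i < $length; $i++) {
        $chars[] = mb_substr($string, $i, 1, 'UTF-8');
    }
    return $chars;
}

$string = 'TE➊➜❚S✶T➪';

echo strlen('➊'), "\n";             // 3: strlen() counts bytes
echo mb_strlen('➊', 'UTF-8'), "\n"; // 1: mb_strlen() counts characters
echo '|' . implode('|', utf8_chars($string)) . '|', "\n";

// Passing the charset explicitly matters before PHP 5.4, where the
// default was ISO-8859-1 rather than UTF-8.
echo htmlspecialchars($string, ENT_QUOTES, 'UTF-8'), "\n";
```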

Upvotes: 1
