Wally Lawless

Reputation: 7557

Why isn't PHP's htmlentities() converting the œ character?

I'm having some trouble with PHP's htmlentities() and htmlspecialchars() functions. The string I am converting contains the character œ (the HTML entity is '&oelig;'), yet neither htmlentities() nor htmlspecialchars() converts it.

When I ran get_html_translation_table(HTML_ENTITIES) to see the translation table that PHP is using, I noticed that the œ character is missing, yet other ligatures like æ (&aelig;) are present. Why is this? Is there a different way that I'm supposed to convert the œ character?

For reference, I'm running PHP 5.3.14 and here's the output from get_html_translation_table(HTML_ENTITIES):

array(100) {
  [" "]=>
  string(6) "&nbsp;"
  ["¡"]=>
  string(7) "&iexcl;"
  ["¢"]=>
  string(6) "&cent;"
  ["£"]=>
  string(7) "&pound;"
  ["¤"]=>
  string(8) "&curren;"
  ["¥"]=>
  string(5) "&yen;"
  ["¦"]=>
  string(8) "&brvbar;"
  ["§"]=>
  string(6) "&sect;"
  ["¨"]=>
  string(5) "&uml;"
  ["©"]=>
  string(6) "&copy;"
  ["ª"]=>
  string(6) "&ordf;"
  ["«"]=>
  string(7) "&laquo;"
  ["¬"]=>
  string(5) "&not;"
  ["­"]=>
  string(5) "&shy;"
  ["®"]=>
  string(5) "&reg;"
  ["¯"]=>
  string(6) "&macr;"
  ["°"]=>
  string(5) "&deg;"
  ["±"]=>
  string(8) "&plusmn;"
  ["²"]=>
  string(6) "&sup2;"
  ["³"]=>
  string(6) "&sup3;"
  ["´"]=>
  string(7) "&acute;"
  ["µ"]=>
  string(7) "&micro;"
  ["¶"]=>
  string(6) "&para;"
  ["·"]=>
  string(8) "&middot;"
  ["¸"]=>
  string(7) "&cedil;"
  ["¹"]=>
  string(6) "&sup1;"
  ["º"]=>
  string(6) "&ordm;"
  ["»"]=>
  string(7) "&raquo;"
  ["¼"]=>
  string(8) "&frac14;"
  ["½"]=>
  string(8) "&frac12;"
  ["¾"]=>
  string(8) "&frac34;"
  ["¿"]=>
  string(8) "&iquest;"
  ["À"]=>
  string(8) "&Agrave;"
  ["Á"]=>
  string(8) "&Aacute;"
  ["Â"]=>
  string(7) "&Acirc;"
  ["Ã"]=>
  string(8) "&Atilde;"
  ["Ä"]=>
  string(6) "&Auml;"
  ["Å"]=>
  string(7) "&Aring;"
  ["Æ"]=>
  string(7) "&AElig;"
  ["Ç"]=>
  string(8) "&Ccedil;"
  ["È"]=>
  string(8) "&Egrave;"
  ["É"]=>
  string(8) "&Eacute;"
  ["Ê"]=>
  string(7) "&Ecirc;"
  ["Ë"]=>
  string(6) "&Euml;"
  ["Ì"]=>
  string(8) "&Igrave;"
  ["Í"]=>
  string(8) "&Iacute;"
  ["Î"]=>
  string(7) "&Icirc;"
  ["Ï"]=>
  string(6) "&Iuml;"
  ["Ð"]=>
  string(5) "&ETH;"
  ["Ñ"]=>
  string(8) "&Ntilde;"
  ["Ò"]=>
  string(8) "&Ograve;"
  ["Ó"]=>
  string(8) "&Oacute;"
  ["Ô"]=>
  string(7) "&Ocirc;"
  ["Õ"]=>
  string(8) "&Otilde;"
  ["Ö"]=>
  string(6) "&Ouml;"
  ["×"]=>
  string(7) "&times;"
  ["Ø"]=>
  string(8) "&Oslash;"
  ["Ù"]=>
  string(8) "&Ugrave;"
  ["Ú"]=>
  string(8) "&Uacute;"
  ["Û"]=>
  string(7) "&Ucirc;"
  ["Ü"]=>
  string(6) "&Uuml;"
  ["Ý"]=>
  string(8) "&Yacute;"
  ["Þ"]=>
  string(7) "&THORN;"
  ["ß"]=>
  string(7) "&szlig;"
  ["à"]=>
  string(8) "&agrave;"
  ["á"]=>
  string(8) "&aacute;"
  ["â"]=>
  string(7) "&acirc;"
  ["ã"]=>
  string(8) "&atilde;"
  ["ä"]=>
  string(6) "&auml;"
  ["å"]=>
  string(7) "&aring;"
  ["æ"]=>
  string(7) "&aelig;"
  ["ç"]=>
  string(8) "&ccedil;"
  ["è"]=>
  string(8) "&egrave;"
  ["é"]=>
  string(8) "&eacute;"
  ["ê"]=>
  string(7) "&ecirc;"
  ["ë"]=>
  string(6) "&euml;"
  ["ì"]=>
  string(8) "&igrave;"
  ["í"]=>
  string(8) "&iacute;"
  ["î"]=>
  string(7) "&icirc;"
  ["ï"]=>
  string(6) "&iuml;"
  ["ð"]=>
  string(5) "&eth;"
  ["ñ"]=>
  string(8) "&ntilde;"
  ["ò"]=>
  string(8) "&ograve;"
  ["ó"]=>
  string(8) "&oacute;"
  ["ô"]=>
  string(7) "&ocirc;"
  ["õ"]=>
  string(8) "&otilde;"
  ["ö"]=>
  string(6) "&ouml;"
  ["÷"]=>
  string(8) "&divide;"
  ["ø"]=>
  string(8) "&oslash;"
  ["ù"]=>
  string(8) "&ugrave;"
  ["ú"]=>
  string(8) "&uacute;"
  ["û"]=>
  string(7) "&ucirc;"
  ["ü"]=>
  string(6) "&uuml;"
  ["ý"]=>
  string(8) "&yacute;"
  ["þ"]=>
  string(7) "&thorn;"
  ["ÿ"]=>
  string(6) "&yuml;"
  ["&"]=>
  string(5) "&amp;"
  ["""]=>
  string(6) "&quot;"
  ["<"]=>
  string(4) "&lt;"
  [">"]=>
  string(4) "&gt;"
}

Upvotes: 2

Views: 2575

Answers (1)

Hugo Delsing

Reputation: 14173

I tried this on PHP 5.4.5 and it outputs &oelig; correctly, so I can't really test it, but I guess it's because œ isn't in the ISO-8859-1 charset, which is used by default. It is in the supplementary character set, so try using ISO-8859-15:

htmlentities($s,  ENT_COMPAT | ENT_HTML401, "ISO-8859-15");
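
To make the charset point concrete, here is a minimal sketch, not a tested fix for 5.3.14. It assumes the charset argument has to match how the input string is actually encoded, so it shows both a UTF-8 and an ISO-8859-15 input. The helper table_has_oelig() and the sample word are made up for illustration. ENT_HTML401 was only added in PHP 5.4, so the sketch passes ENT_COMPAT alone, which should also be accepted on 5.3.

```php
<?php
// Hypothetical helper (not from the original post): reports whether the
// translation table for a given charset contains the &oelig; entity.
function table_has_oelig($charset)
{
    // The third parameter of get_html_translation_table() selects the
    // charset (available since PHP 5.3.4).
    $table = get_html_translation_table(HTML_ENTITIES, ENT_COMPAT, $charset);
    return in_array('&oelig;', $table, true);
}

// "cœur" written as explicit bytes so the example does not depend on the
// encoding of this source file.
$utf8   = "c\xC5\x93ur"; // œ is the two bytes C5 93 in UTF-8
$latin9 = "c\xBDur";     // œ is the single byte BD in ISO-8859-15

// Check which tables know about œ.
var_dump(table_has_oelig('ISO-8859-1'));
var_dump(table_has_oelig('ISO-8859-15'));
var_dump(table_has_oelig('UTF-8'));

// Convert each string with the charset that matches its encoding.
echo htmlentities($utf8, ENT_COMPAT, 'UTF-8'), "\n";
echo htmlentities($latin9, ENT_COMPAT, 'ISO-8859-15'), "\n";
```

If the string comes from a UTF-8 page or database, passing "UTF-8" is probably the better choice than ISO-8859-15, since a mismatched charset would make PHP misread the multi-byte œ.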

Upvotes: 2
