Brandon Fredericksen

Reputation: 1168

convert special characters to html code with php

I need a function that will clean a string's special characters. I do NOT want it to convert HTML tags like <br /> to &lt;br /&gt;.

I want to convert characters like •, ½, and ’ to HTML entities.

This is the function I currently use, but it doesn't appear to work with the fractions.

function cleanText($str){

    // str_replace() compares raw bytes, so this file must be saved in the
    // same encoding as $str (e.g. UTF-8) for the literals below to match.
    $str = str_replace("Ñ", "&#209;", $str);
    $str = str_replace("ñ", "&#241;", $str);
    $str = str_replace("Á", "&#193;", $str);
    $str = str_replace("á", "&#225;", $str);
    $str = str_replace("É", "&#201;", $str);
    $str = str_replace("é", "&#233;", $str);
    $str = str_replace("ú", "&#250;", $str);
    $str = str_replace("ù", "&#249;", $str);
    $str = str_replace("Í", "&#205;", $str);
    $str = str_replace("í", "&#237;", $str);
    $str = str_replace("Ó", "&#211;", $str);
    $str = str_replace("ó", "&#243;", $str);
    $str = str_replace("“", "&#8220;", $str);
    $str = str_replace("”", "&#8221;", $str);

    $str = str_replace("‘", "&#8216;", $str);
    $str = str_replace("’", "&#8217;", $str);
    $str = str_replace("—", "&#8212;", $str);

    $str = str_replace("–", "&#8211;", $str);
    $str = str_replace("™", "&trade;", $str);
    $str = str_replace("ü", "&#252;", $str);
    $str = str_replace("Ü", "&#220;", $str);
    $str = str_replace("Ê", "&#202;", $str);
    $str = str_replace("ê", "&#234;", $str); // U+00EA
    $str = str_replace("Ç", "&#199;", $str);
    $str = str_replace("ç", "&#231;", $str);
    $str = str_replace("È", "&#200;", $str);
    $str = str_replace("è", "&#232;", $str);
    $str = str_replace("•", "&#8226;", $str); // U+2022 BULLET

    $str = str_replace("¼", "&#188;", $str);
    $str = str_replace("½", "&#189;", $str);
    $str = str_replace("¾", "&#190;", $str);

    return $str;

}

Upvotes: 2

Views: 12515

Answers (4)

Levi Morrison

Reputation: 19552

You can replace your entire function with htmlentities and the ENT_SUBSTITUTE flag. It will be much faster, and it will also work correctly.

Note: ENT_SUBSTITUTE is available as of PHP 5.4.
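
As a rough illustration of what that call does, here is a minimal sketch. The explicit 'UTF-8' argument and the sample strings are assumptions, not part of the question. It also shows what ENT_SUBSTITUTE changes: without it, htmlentities returns an empty string when the input contains an invalid byte sequence.

```php
<?php
// Sample input; assumed to be UTF-8 encoded.
$str = "•, ½, ’";

// Named entities are produced for characters that have one in HTML 4.01.
$encoded = htmlentities($str, ENT_SUBSTITUTE, 'UTF-8');
echo $encoded, "\n"; // &bull;, &frac12;, &rsquo;

// "\xE9" is a lone ISO-8859-1 byte, which is not valid UTF-8.
$bad = "caf\xE9";
$without = htmlentities($bad, ENT_COMPAT, 'UTF-8');     // empty string
$with    = htmlentities($bad, ENT_SUBSTITUTE, 'UTF-8'); // bad byte becomes U+FFFD
var_dump($without === '', $with !== '');
```

Keep in mind that htmlentities also escapes <, > and &, so a tag such as <br /> would be converted too.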

Upvotes: 4

David Nguyen

Reputation: 8528

Try this. I've used this class to convert every non-ASCII character to a numeric HTML entity:

class unicode_replace_entities {
    // Replace every non-ASCII character of a UTF-8 string with a decimal numeric entity.
    public function UTF8entities($content="") {
        $contents = $this->unicode_string_to_array($content);
        $swap = "";
        $iCount = count($contents);
        for ($o=0;$o<$iCount;$o++) {
            $contents[$o] = $this->unicode_entity_replace($contents[$o]);
            $swap .= $contents[$o];
        }
        return mb_convert_encoding($swap, "UTF-8"); //not really necessary, but why not.
    }
    // Split a UTF-8 string into an array of single characters.
    public function unicode_string_to_array( $string ) { //adjwilli
        $array = array(); // an empty string yields an empty array
        $strlen = mb_strlen( $string, "UTF-8" );
        while ($strlen) {
            $array[] = mb_substr( $string, 0, 1, "UTF-8" );
            $string = mb_substr( $string, 1, $strlen, "UTF-8" );
            $strlen = mb_strlen( $string, "UTF-8" );
        }
        return $array;
    }
    // Decode one UTF-8 character to its code point and return it as "&#N;".
    public function unicode_entity_replace($c) { //m. perez
        $h = ord($c[0]);
        if ($h < 0xC2) {
            // ASCII, or a byte that cannot start a multi-byte sequence: leave as-is
            return $c;
        }

        if ($h <= 0xDF) {
            // 2-byte sequence
            $h = ($h & 0x1F) << 6 | (ord($c[1]) & 0x3F);
            return "&#" . $h . ";";
        } else if ($h <= 0xEF) {
            // 3-byte sequence
            $h = ($h & 0x0F) << 12 | (ord($c[1]) & 0x3F) << 6 | (ord($c[2]) & 0x3F);
            return "&#" . $h . ";";
        } else if ($h <= 0xF4) {
            // 4-byte sequence: only the low 3 bits of the lead byte carry data
            $h = ($h & 0x07) << 18 | (ord($c[1]) & 0x3F) << 12 | (ord($c[2]) & 0x3F) << 6 | (ord($c[3]) & 0x3F);
            return "&#" . $h . ";";
        }
        // Lead bytes above 0xF4 are not valid UTF-8; return them untouched
        return $c;
    }
}

$oUnicodeReplace = new unicode_replace_entities();

$string = $oUnicodeReplace->UTF8entities($string);

Mind you, it converts every non-ASCII character, but it does take care of the weird ones. It's not my own script, and I no longer remember where I found it.
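
For comparison, the mbstring extension has a built-in that does much the same job as the class above. This is only a minimal sketch, assuming mbstring is installed and the input is UTF-8. The helper name numericEntities is hypothetical.

```php
<?php
// Hypothetical helper; requires the mbstring extension.
function numericEntities(string $str): string {
    // Conversion map: [first code point, last code point, offset, mask].
    // 0x80..0x10FFFF covers everything outside ASCII; the mask keeps all bits.
    $convmap = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($str, $convmap, 'UTF-8');
}

echo numericEntities("•, ½, ’<br />"), "\n"; // &#8226;, &#189;, &#8217;<br />
```

Because only code points from 0x80 upward are in the map, ASCII markup such as <br /> passes through unchanged.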

Upvotes: 2

Macmade

Reputation: 53950

Guess it's time to take a look at the htmlentities PHP function, and its options.

Basically, you can replace your whole function with:

$str = htmlentities( $str );

It will also be a lot more efficient.

Be sure to take a look at the function's optional parameters, if you need special processing (especially ENT_SUBSTITUTE).

$str = htmlentities( $str, ENT_SUBSTITUTE );
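
Since the question asks that tags like <br /> stay intact, one possible way to combine this with htmlentities is to undo the escaping of <, > and & afterwards. The sketch below assumes UTF-8 input. The helper name entitiesKeepTags is hypothetical and not from the answers above.

```php
<?php
// Hypothetical helper, shown as a sketch rather than a definitive solution.
function entitiesKeepTags(string $str): string {
    // ENT_NOQUOTES leaves " and ' alone; ENT_SUBSTITUTE guards against invalid UTF-8.
    $encoded = htmlentities($str, ENT_NOQUOTES | ENT_SUBSTITUTE, 'UTF-8');
    // Turn &lt;, &gt; and &amp; back into <, > and & so existing markup survives.
    return htmlspecialchars_decode($encoded, ENT_NOQUOTES);
}

echo entitiesKeepTags("½ cup • it’s done<br />"), "\n";
// &frac12; cup &bull; it&rsquo;s done<br />
```

Restoring the tags means the result is no longer escaped for untrusted input, so this only makes sense for text whose markup you already trust.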

Upvotes: 3
