Chameron

Reputation: 2974

PHP - htmlspecialchars with Unicode

    $string = "Główny folder grafik<p>asd nc</p>";

    echo htmlspecialchars($string);

On the live site this outputs:

G&#322;ówny folder grafik<p>asd nc</p>

On my local machine it outputs:

Główny folder grafik<p>asd nc</p>

What is the problem? I want the output on the live site to look like the output on my local machine.

Upvotes: 3

Views: 1774

Answers (4)

rubo77

Reputation: 20835

If you need every character that has an associated named entity to be translated, use htmlentities() instead. That function is identical to htmlspecialchars() in all ways, except that htmlentities() translates all characters which have HTML character entity equivalents into those entities.
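To make the difference concrete, here is a small sketch that runs both functions on the same input. It assumes the source file is saved as UTF-8 and passes the charset explicitly; the variable names are only for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8.
$word = 'Główny <b>';

// htmlspecialchars() only escapes the HTML-significant characters (& < > " ').
$special = htmlspecialchars($word, ENT_QUOTES, 'UTF-8');

// htmlentities() additionally replaces characters that have a named entity,
// so "ó" becomes "&oacute;".
$entities = htmlentities($word, ENT_QUOTES, 'UTF-8');

echo $special, "\n";   // Główny &lt;b&gt;
echo $entities, "\n";
```

In both cases the angle brackets are escaped; only htmlentities() touches the accented letter.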

However, even htmlentities() does not encode all Unicode characters. It encodes what it can (all of Latin-1), and the others slip through (e.g. Љ).

The function below checks each character against the ASCII table, so you can customize which characters are encoded and which are left alone.

(Note: it is certainly not fast.)

/**
 * Unicode-proof htmlentities.
 * Returns 'normal' chars as chars and weirdos as numeric html entities.
 * @param  string $str input string
 * @return string      encoded output
 */
function superentities( $str ){
    // decode existing entities first, otherwise they would be double-escaped
    $str = html_entity_decode(stripslashes($str),ENT_QUOTES,'UTF-8');
    // split into an array of single (possibly multi-byte) characters
    $ar = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $str2 = ''; // initialize the output before appending to it
    foreach ($ar as $c){
        $o = ord($c);
        if ( (strlen($c) > 1) || /* multi-byte [unicode] */
            ($o <32 || $o > 126) || /* <- control / latin weirdos -> */
            ($o >33 && $o < 40) ||/* quotes + ampersand */
            ($o >59 && $o < 63) /* html */
        ) {
            // convert to numeric entity (the convmap covers every Unicode code point)
            $c = mb_encode_numericentity($c,array (0x0, 0x10ffff, 0, 0x1fffff), 'UTF-8');
        }
        $str2 .= $c;
    }
    return $str2;
}
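If per-character control is not needed, a shorter variant is possible. The sketch below is not the function above: numeric_entities() is a hypothetical helper name, it does not decode existing entities first, and it leaves control characters alone. It assumes the mbstring extension is available and the input is UTF-8.

```php
<?php
// Hypothetical helper, shown only as an alternative sketch.
function numeric_entities(string $str): string
{
    // Step 1: escape the HTML-significant characters (& < > " ').
    $escaped = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

    // Step 2: turn every code point from U+0080 upward into a decimal
    // numeric entity. The convmap is [start, end, offset, mask].
    return mb_encode_numericentity($escaped, [0x80, 0x10FFFF, 0, 0x1FFFFF], 'UTF-8');
}

echo numeric_entities('Główny <p>'), "\n"; // G&#322;&#243;wny &lt;p&gt;
```

The output is pure ASCII, so it renders the same regardless of the charset the page is served with.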

Upvotes: 0

Christophe

Reputation: 4828

You need to add extra parameters to the htmlspecialchars() function. The following should work:

htmlspecialchars($string, ENT_QUOTES, "UTF-8");

Upvotes: 1

fabrik

Reputation: 14365

You may want to pass the optional charset parameter to htmlspecialchars(). Its default is ISO-8859-1 in PHP versions before 5.4; later versions default to UTF-8.
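Since the default depends on the PHP version (and, from PHP 5.6 on, on the default_charset ini setting), a quick way to compare the live and local servers is to print both values. This is only a diagnostic sketch and assumes the file is saved as UTF-8.

```php
<?php
// Values that influence the default charset of htmlspecialchars() on this server.
echo 'PHP version: ', PHP_VERSION, "\n";
echo 'default_charset: ', var_export(ini_get('default_charset'), true), "\n";

$string = "Główny folder grafik<p>asd nc</p>";

// Passing the charset explicitly makes the result independent of those defaults.
$escaped = htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
echo $escaped, "\n"; // Główny folder grafik&lt;p&gt;asd nc&lt;/p&gt;
```

If the two servers report different values, the explicit third argument is what keeps their output identical.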

Upvotes: 0

Pascal MARTIN

Reputation: 400992

htmlspecialchars() accepts additional parameters -- the third one being the charset.

Try specifying that third parameter.

Upvotes: 1
