richard

Reputation: 14520

Replacing ’ character in PHP

I'm having a hard time trying to replace this weird right single quote character. I'm using str_replace like this:

str_replace("’", '\u1234', $string);

It looks like I cannot figure out what character the quote really is. Even when I copy paste it directly from PHPMyAdmin it still doesn't work. Do I have to escape it somehow?

The character: http://www.lukomon.com/Afbeelding%204.png

EDIT: It turned out to be a Microsoft left single quote, which I could replace with the function from Phill Pafford's comment. Not sure which answer I should mark now.

Upvotes: 5

Views: 29019

Answers (10)

Emanuel A.

Reputation: 112

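You can get the byte value of the character with ord, then replace it with your desired character. This only works when the character is a single byte, as it is in Windows-1252:

$asciicode = ord('’'); // 146 if the script file is saved as Windows-1252
$stringfixed = str_replace(chr($asciicode), '\'', $string);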

Upvotes: 0

David Kinkead

Reputation: 181

I had the same issue and found this to work:

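function replace_rsquote($haystack, $replacewith) {
    // U+2019 is the three bytes E2 80 99 in UTF-8. Match all three so that
    // other characters starting with byte 226 are left alone.
    $pos = strpos($haystack, "\xE2\x80\x99");
    if ($pos !== false) {
        // Replaces the first occurrence only.
        return substr_replace($haystack, $replacewith, $pos, 3);
    }
    return $haystack;
}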

Example:

echo replace_rsquote("Nick’s","'"); //Nick's
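If the string can contain more than one of these quotes, a plain str_replace on the full byte sequence may be enough. This is only a sketch and assumes the data is UTF-8; the constant and function names are made up for illustration:

```php
<?php
// U+2019 RIGHT SINGLE QUOTATION MARK as UTF-8 bytes, written with escapes so
// the example does not depend on how this file is saved.
const RSQUO_UTF8 = "\xE2\x80\x99";

// Hypothetical helper: replaces every occurrence, not just the first.
function replace_all_rsquotes(string $haystack, string $replacement): string
{
    // str_replace compares raw bytes, which is safe here because the needle
    // is a complete UTF-8 sequence.
    return str_replace(RSQUO_UTF8, $replacement, $haystack);
}

echo replace_all_rsquotes("Nick" . RSQUO_UTF8 . "s and Ann" . RSQUO_UTF8 . "s", "'"), "\n";
// Nick's and Ann's
```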

Upvotes: 4

kingjeffrey

Reputation: 15260

Don't use any regex functions (preg_replace or mb_ereg_replace). They are way too heavy for this.

// Match the full UTF-8 sequence; chr(226) alone would leave two stray bytes behind.
str_replace("\xE2\x80\x99", '\u2019', $string);

If your needle is a multibyte character, you may have better luck with this bespoke function:

<?php 
function mb_str_replace($needle, $replacement, $haystack) {
    $needle_len = mb_strlen($needle);
    $replacement_len = mb_strlen($replacement);
    $pos = mb_strpos($haystack, $needle);
    while ($pos !== false)
    {
        $haystack = mb_substr($haystack, 0, $pos) . $replacement
                . mb_substr($haystack, $pos + $needle_len);
        $pos = mb_strpos($haystack, $needle, $pos + $replacement_len);
    }
    return $haystack; 
} 
?>

credit for this last function: http://www.php.net/manual/en/ref.mbstring.php#86120

Upvotes: 0

arena-ru

Reputation: 1020

Gumbo said it right:
- save your script as a UTF-8 file
- and use http://php.net/mbstring (as Sarfraz pointed out in his last example)

Upvotes: 1

Peter Bailey

Reputation: 105878

This character you have is the Right Single Quotation Mark.

To replace it with a pattern you'll want to do something like this

$string = preg_replace( "/\\x{2019}/u", 'replacement', $string );

But that really only addresses the symptom. The problem is that you don't have consistent use of character encodings throughout your application, as others have noted.
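One way to address the cause rather than the symptom might be to convert incoming text to UTF-8 once, at the point where it enters the application. The sketch below assumes that anything which is not valid UTF-8 arrived as Windows-1252 (the "Microsoft" quotes); that is a guess about the data, and to_utf8 is a hypothetical helper name:

```php
<?php
// Hypothetical helper: return $input as UTF-8, assuming that anything which is
// not already valid UTF-8 is Windows-1252.
function to_utf8(string $input): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input; // already UTF-8, leave untouched
    }
    return mb_convert_encoding($input, 'UTF-8', 'Windows-1252');
}

$legacy = "Nick\x92s"; // byte 0x92 is the right single quote in Windows-1252
var_dump(to_utf8($legacy) === "Nick\xE2\x80\x99s"); // bool(true)
```

After that, a single replacement written against the UTF-8 form of the character covers both sources of data.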

Upvotes: 0

Sarfraz

Reputation: 382656

This has happened to me too. A couple of things:

  • Use htmlentities function for your text

    $my_text = htmlentities($string, ENT_QUOTES, 'UTF-8');

More info about the htmlentities function.

  • Use proper document type, this did the trick for me.

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

  • Use utf-8 encoding type in your page:

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Here is the final prototype for your page:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Untitled Document</title>
</head>    
<body>

<?php     
    // your code related to database        
    $my_text = htmlentities($string, ENT_QUOTES, 'UTF-8');    
?>

</body>
</html>


If you want to replace it however, try the mb_ereg_replace function.

Example:

mb_internal_encoding("UTF-8");
mb_regex_encoding("UTF-8");

$my_text = mb_ereg_replace("’","'", $string);

Upvotes: 8

Casey Chu

Reputation: 25463

To find what character it is, run it through the ord function, which will give you the value of the character's first byte (not an ASCII code, since this character is not ASCII):

echo ord('’'); // 226

Now that you know what it is, you can do this:

// In UTF-8 the character is three bytes (E2 80 99), so search for all of them.
str_replace("\xE2\x80\x99", "'", $string);
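Since ord only looks at the first byte, it can help to dump every byte of the character. A small sketch, using escape sequences so the result does not depend on how the script file is saved (the variable names are only for illustration):

```php
<?php
$utf8Quote   = "\xE2\x80\x99"; // U+2019 encoded as UTF-8
$cp1252Quote = "\x92";         // the same character in Windows-1252

echo strlen($utf8Quote), "\n";   // 3
echo ord($utf8Quote), "\n";      // 226 (first byte only)
echo bin2hex($utf8Quote), "\n";  // e28099

echo strlen($cp1252Quote), "\n"; // 1
echo ord($cp1252Quote), "\n";    // 146
```

That would explain why some answers here report 226 and others 146: both describe the same quote in different encodings.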

Upvotes: 2

Pekka

Reputation: 449395

To replace it:

If your script file is encoded in the same encoding as the data you are trying to do the replacement in, it should work the way you posted it. If you're working with UTF-8 data, make sure the script is encoded in UTF-8 and it's not your editor silently transliterating the character when you paste it.

If it won't work, try escaping it as described below and see what code it returns.

To escape it:

If your source file is encoded in UTF-8, this should work:

$string = htmlentities($string, ENT_QUOTES, "UTF-8");

the default character set of html... is iso-8859-1. Anything differing from that must be explicitly stated.

For more complex character conversion issues, always check out the User Contributed Notes to functions like htmlentities(), there are often real gems to be found there.

In General:

Bobince is right in his comment: systemic character set problems should be sorted out systematically so they don't bite you in the ass, if only by defining which character set is used at every step of the way:

  • How the script file is encoded;
  • How the document is served;
  • How the data is stored in the database;
  • How the database connection is encoded.
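As a rough sketch of what stating the character set at each step could look like, assuming MySQL accessed through PDO and UTF-8 everywhere (the helper name, host and database name are placeholders, not anything from the question):

```php
<?php
// Hypothetical helper; only the charset part of the DSN matters here.
function utf8_mysql_dsn(string $host, string $dbname): string
{
    return "mysql:host={$host};dbname={$dbname};charset=utf8mb4";
}

// Default encoding used by the mb_* functions.
mb_internal_encoding('UTF-8');

// How the document is served (skipped on the command line).
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// How the database connection is encoded: set through the DSN when connecting,
// e.g. new PDO(utf8_mysql_dsn('localhost', 'mydb'), $user, $password);
echo utf8_mysql_dsn('localhost', 'mydb'), "\n";
// mysql:host=localhost;dbname=mydb;charset=utf8mb4
```

The script file's own encoding and the table or column character sets still have to be checked separately, in the editor and in the database.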

Upvotes: 1

Gumbo

Reputation: 655189

If you are using non-ASCII characters in your PHP code, you need to make sure that you’re using the same character encoding as in the data you are processing. Your attempt probably fails because you are using a different character encoding in your PHP script than in $string.

Additionally, if you’re using a multibyte character encoding such as UTF-8, you should also use the multibyte aware string functions.
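To illustrate the difference, here is a short sketch comparing the byte-oriented functions with their multibyte counterparts on a UTF-8 string (the sample string is made up):

```php
<?php
$name = "Nick\xE2\x80\x99s"; // "Nick’s" as UTF-8 bytes

echo strlen($name), "\n";             // 8 (bytes)
echo mb_strlen($name, 'UTF-8'), "\n"; // 6 (characters)

// substr counts bytes and cuts the quote in half; mb_substr keeps it whole.
echo bin2hex(substr($name, 0, 5)), "\n";             // 4e69636be2
echo bin2hex(mb_substr($name, 0, 5, 'UTF-8')), "\n"; // 4e69636be28099
```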

Upvotes: 1

user97410

Reputation: 724

Why not run the string through htmlspecialchars() and output it to see what it turns that character into, so you know what to use as your replace expression?
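A quick check along those lines, assuming the data is UTF-8, suggests that htmlspecialchars leaves this character alone, while htmlentities turns it into a named entity:

```php
<?php
$quote = "\xE2\x80\x99"; // U+2019 encoded as UTF-8

// htmlspecialchars only escapes & < > " and ', so the quote passes through.
var_dump(htmlspecialchars($quote, ENT_QUOTES, 'UTF-8') === $quote); // bool(true)

// htmlentities converts every character that has a named HTML entity.
echo htmlentities($quote, ENT_QUOTES, 'UTF-8'), "\n"; // &rsquo;
```

So htmlentities is probably the more useful of the two for identifying the character.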

Upvotes: 0
