Ivan

Reputation: 15912

How to remove &nbsp; from a UTF-8 string?

My database is returning some strings like:

This is a string

This is a problem when the string is long enough and you have a maximum width set, because the text cannot wrap:

<p style="width:50px">This&nbsp;is&nbsp;a&nbsp;string</p>

To get rid of the &nbsp; entities I've tried the following filters, without success:

$new = preg_replace("/&nbsp;/i", " ", $str);
$new = str_replace('&nbsp;', ' ', $str);
$new = html_entity_decode($str);

Here is a PHP fiddle to see this in action (I had to encode the database output as hex; the string is in Spanish, sorry).

How do I deal with this? Why is html_entity_decode() not working? And what about the replace functions? Thanks.

Upvotes: 10

Views: 28820

Answers (2)

miles_monroe

Reputation: 39

Convert the string to HTML entities, replace the one you want, and decode it back:

$str = str_replace('&nbsp;', ' ', htmlentities($new));
$new = html_entity_decode($str);
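
A minimal sketch of the same round trip wrapped in a function, assuming the input is UTF-8 and contains the real non-breaking space character rather than the literal text &nbsp;. The function name replace_nbsp_via_entities and the sample string are hypothetical, and the encoding is passed explicitly so the result does not depend on the default of the PHP version in use:

```php
<?php
// Hypothetical helper, not part of PHP: replaces UTF-8 non-breaking spaces
// by round-tripping the string through HTML entities.
function replace_nbsp_via_entities(string $str): string
{
    // htmlentities() turns the U+00A0 character into the text "&nbsp;".
    $encoded = htmlentities($str, ENT_QUOTES, 'UTF-8');
    // Now the entity text exists and can be replaced with a plain space.
    $encoded = str_replace('&nbsp;', ' ', $encoded);
    // Decode the remaining entities back to their original characters.
    return html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
}

// Sample input assumed for illustration: "Esto", U+00A0, "es".
$clean = replace_nbsp_via_entities("Esto\xC2\xA0es");
```

This encodes and decodes the whole string just to change one character, so it is probably more work than replacing the bytes directly.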

Upvotes: 3

unixmiah

Reputation: 3145

This gets tricky; it's not as straightforward as replacing a normal string.

Try this.

$str = str_replace("\xc2\xa0", ' ', $str);

Or this, though the above should work:

$nbsp = html_entity_decode("&nbsp;");
$s = html_entity_decode("[&nbsp;]");
$s = str_replace($nbsp, " ", $s);
echo $s;

@ref: https://moovwebconfluence.atlassian.net/wiki/pages/viewpage.action?pageId=1081435
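
As for why the attempts in the question do nothing: a likely explanation, assuming the database returns the actual U+00A0 character (bytes C2 A0 in UTF-8) and not the six-character text &nbsp;, is that there is no entity in the string to replace or decode. A small sketch with a made-up sample string that shows the difference:

```php
<?php
// Hypothetical sample: "a", a real UTF-8 non-breaking space (U+00A0), "b".
$str = "a\xC2\xA0b";

// The raw bytes contain c2a0, not the text "&nbsp;".
$hex = bin2hex($str);

// Searching for the entity text finds nothing, so the string is unchanged.
$unchanged = str_replace('&nbsp;', ' ', $str);

// Replacing the two-byte UTF-8 sequence works.
$byBytes = str_replace("\xC2\xA0", ' ', $str);

// With the /u modifier, the pattern can name the code point directly.
$byRegex = preg_replace('/\x{00A0}/u', ' ', $str);

// Passing the encoding explicitly makes html_entity_decode() return the
// UTF-8 form of the character regardless of the configured default charset.
$nbsp = html_entity_decode('&nbsp;', ENT_QUOTES, 'UTF-8');
$byEntity = str_replace($nbsp, ' ', $str);
```

Running bin2hex() on the real database value is a quick way to check which form you actually have before picking a replacement.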

Upvotes: 27
