Munib

Reputation: 3701

Is there a way to convert Unicode numeric character references to the actual characters?

I am about to pull my hair out over this issue, so I would appreciate any solution. I have an HTML string:

$html = '<div id="main">What is going on </div><div>&#1740;&#1729;&#1575;&#1722; 
&#1578;&#1608; &#1705;&#1608;&#1574;&#1740; &#1729;</div>
<span>Some More Text &lt;good&gt;</span>';

This is a mixed HTML string containing HTML entities, English text, and numeric character references for Unicode characters. I want to convert only the numeric character references to the actual Unicode characters. The string also contains user formatting that I do not want to lose.

I want the following output

$html = '<div id="main">What is going on </div><div>‘۔سلطان محمود نے گاڑی روکتے ہوئے</div>
<span>Some More Text &lt;good&gt;</span>';

I have tried

html_entity_decode($html, ENT_COMPAT, 'utf-8');

but this also converts &lt; to < and &gt; to >, which I do not want.

Is there any other solution?

Note: I am not asking about Unicode characters being displayed incorrectly on my webpage. They display fine, because the browser renders the numeric references as the real Unicode characters. But I want the actual Unicode characters in the page source too.

Upvotes: 1

Views: 108

Answers (1)

SWilk

Reputation: 3486

Try using preg_replace_callback with html_entity_decode as the callback.

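// Decode one matched numeric reference, e.g. &#1740;, into its UTF-8 character.
$decode_single_entity = function ($matches) {
    return html_entity_decode($matches[0], ENT_COMPAT, 'utf-8');
};
// Only decimal references (&#NNNN;) match, so named entities such as &lt; are left alone.
$string = preg_replace_callback('/&#\d+;/', $decode_single_entity, $html);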
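
Two edge cases the pattern above does not cover may be worth handling: hexadecimal references such as &#x6CC;, and numeric references to markup characters such as &#60;, which would turn into a literal < and could change the HTML structure. The following is only a sketch of one way to handle both, assuming PHP 7 or later. The function name decode_numeric_refs and the sample string are made up for illustration and are not part of the original question or answer.

```php
<?php
// Hypothetical helper: decode decimal and hexadecimal numeric character
// references, but leave named entities and references to markup-significant
// characters (< > & " ') exactly as they were written.
function decode_numeric_refs(string $html): string
{
    $result = preg_replace_callback(
        '/&#(?:x[0-9a-f]+|\d+);/i',
        function (array $m): string {
            // Decode just this one reference to its UTF-8 character.
            $char = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML5, 'UTF-8');
            // Keep the reference encoded if decoding it could alter the markup.
            return in_array($char, ['<', '>', '&', '"', "'"], true) ? $m[0] : $char;
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

// Sample input (illustrative only).
$sample = '<div>&#1740;&#1729;&#1575;&#1722; &#x6CC;</div><span>&lt;good&gt; &#60;b&#62;</span>';
echo decode_numeric_refs($sample), PHP_EOL;
```

With this approach &lt;good&gt; and &#60;b&#62; both stay encoded, while the Urdu letters come out as real UTF-8 characters. If references to markup characters can never occur in your input, the simpler version above is probably enough.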

Upvotes: 1
