Aunnoy

Reputation: 37

remove bengali diacritics unicode php

Is there any way to print the Bengali vowel signs without the dotted circle? I found a link which says that prepending an NBSP to the vowel sign should work. It does, but not for the vowel signs that are written before the consonant (e.g. ো ে ি). I could not attach an image since I am new to this site. If anyone wants a visual representation of my question, please let me know your email address and I will send you an email. Thanks in advance.

Upvotes: 0

Views: 715

Answers (2)

Jukka K. Korpela

Reputation: 201568

It is true that you should use a no-break space (NBSP) before a combining mark to show it in (apparent) isolation; this is specified in clause 7.9 Combining marks in the Unicode Standard, chapter 7 (the name of the chapter is misleading, since it has general information, too, in addition to dealing with European scripts). However, it depends on the rendering software and the font used whether this has the desired effect.
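As a small illustration of the NBSP convention, here is a sketch in Python (chosen only for brevity; the helper name isolate_mark is hypothetical and not part of any library). It prepends U+00A0 to a single combining mark and uses the standard unicodedata module to confirm that the character really is a mark. It only builds the string; whether the result is rendered without a dotted circle still depends on the software and font.

```python
import unicodedata

NBSP = "\u00A0"  # NO-BREAK SPACE, the conventional base for an isolated mark


def isolate_mark(mark: str) -> str:
    """Return NBSP followed by the given combining mark (hypothetical helper)."""
    # General categories Mn, Mc and Me all start with "M" (combining marks).
    if len(mark) != 1 or not unicodedata.category(mark).startswith("M"):
        raise ValueError("expected a single combining mark")
    return NBSP + mark


sign_e = "\u09C7"  # BENGALI VOWEL SIGN E
print(unicodedata.name(sign_e))
print([f"U+{ord(ch):04X}" for ch in isolate_mark(sign_e)])
```

The check on the general category is there only to catch accidental use with a base letter; it does not say anything about how a browser will shape the sequence.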

In an HTML document, a combination of, say, NBSP and U+09C7 BENGALI VOWEL SIGN E is shown as blank in Chrome. This is an odd bug, of course. On IE and Firefox, you mostly get the rendering with a dotted circle, apparently because the browser does not want to apply the combining mark to a base character from a different font. If you use, say, &nbsp;&#x9c7; as such with no styling, then browsers typically pick up the no-break space from Times New Roman and the Bengali character from another font, such as Vrinda. You can fix this by setting the font of the no-break space to the same as that of the Bengali character, e.g.

<p style="font-family: Vrinda">&nbsp;&#x9c7;
<p style="font-family: Sun-ExtA">&nbsp;&#x9c7;
<p style="font-family: Nirmala UI">&nbsp;&#x9c7;
<p style="font-family: FreeSerif">&nbsp;&#x9c7;
<p style="font-family: Code2000">&nbsp;&#x9c7;
<p style="font-family: Arial Unicode MS">&nbsp;&#x9c7;
<p style="font-family: ALPHA-Demo">&nbsp;&#x9c7;

In practice, you would use a font-family value that is a suitable list of fonts. Of course, this would not work on computers that have none of the fonts listed. And it won’t work in Chrome (or in Opera, which displays the symbol with a dotted circle).
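To make the idea of a font list concrete, here is a minimal sketch in Python that generates such markup on the server side. The function name isolated_mark_html and the particular choice of fonts are assumptions for illustration; the fonts are taken from the examples above, and the limitations just mentioned (missing fonts, Chrome, Opera) apply unchanged.

```python
from typing import Sequence


def isolated_mark_html(code_point: int, fonts: Sequence[str]) -> str:
    """Build a <p> that puts NBSP and the mark under one font-family list.

    Hypothetical helper: it only assembles markup and cannot guarantee
    that any of the listed fonts is installed on the reader's computer.
    """
    # Quote names that contain spaces; single-word names can stay bare.
    font_list = ", ".join(f"'{name}'" if " " in name else name for name in fonts)
    return f'<p style="font-family: {font_list}">&nbsp;&#x{code_point:x};</p>'


print(isolated_mark_html(0x9C7, ["Vrinda", "Nirmala UI", "FreeSerif"]))
```

Because the NBSP and the vowel sign sit in the same element, the browser tries the fonts in order for both characters, which is the point of the fix described above.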

The conclusion is that unless you are targeting a specific audience with known browsers and fonts, you should probably present the characters as images.

Upvotes: 2

Jukka K. Korpela

Reputation: 201568

If I understand the question correctly, it is about writing Bengali letters and is not related to PHP or the web in any particular way. As the Bengali vowel signs combine with consonants, it seems that if you want to show a vowel on its own, you should use the independent vowel letters, i.e. characters like U+0993 BENGALI LETTER O “ও”.
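If that reading is right, the substitution could be sketched as follows in Python. The table name SIGN_TO_LETTER and the function to_independent are hypothetical; the code points are the Bengali dependent vowel signs and the independent letters with the same names in the Unicode character database.

```python
# Dependent vowel sign -> independent vowel letter (Bengali block).
SIGN_TO_LETTER = {
    0x09BE: 0x0986,  # VOWEL SIGN AA        -> LETTER AA
    0x09BF: 0x0987,  # VOWEL SIGN I         -> LETTER I
    0x09C0: 0x0988,  # VOWEL SIGN II        -> LETTER II
    0x09C1: 0x0989,  # VOWEL SIGN U         -> LETTER U
    0x09C2: 0x098A,  # VOWEL SIGN UU        -> LETTER UU
    0x09C3: 0x098B,  # VOWEL SIGN VOCALIC R -> LETTER VOCALIC R
    0x09C7: 0x098F,  # VOWEL SIGN E         -> LETTER E
    0x09C8: 0x0990,  # VOWEL SIGN AI        -> LETTER AI
    0x09CB: 0x0993,  # VOWEL SIGN O         -> LETTER O
    0x09CC: 0x0994,  # VOWEL SIGN AU        -> LETTER AU
}


def to_independent(text: str) -> str:
    """Replace each vowel sign with its independent letter (hypothetical helper).

    Intended for signs shown in isolation; applying it to running text
    would also rewrite signs that are attached to consonants.
    """
    return text.translate(SIGN_TO_LETTER)


print(to_independent("\u09CB \u09C7 \u09BF"))
```

Characters that are not in the table, such as consonants and spaces, pass through unchanged.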

Upvotes: 0
