Reputation: 307
We're using FreeType to render individual characters to a texture atlas, then rendering from this texture to the screen. However, when we render Arabic, the characters don't join up as they should. They all look like individual characters placed next to each other. If we put the same characters into Notepad, for example, they do join up, but if we then put a space between each Arabic character they separate and look like our rendering again. If we remove the spaces in Notepad, the characters on either side of each removed space "change" and join together. Clearly, the combination of adjacent characters changes how they appear.
So, how can we achieve joined-up words in Arabic with FreeType? Can we print the entire "word" to a bitmap using FreeType and have it automatically adjust the glyphs to join together, or is there some sort of translation we can apply to the list of UTF-8 characters that converts them into new UTF-8 characters which do join up when placed next to each other?
Thanks
Shaun
Upvotes: 2
Views: 2212
Reputation: 301
It is indeed possible to do a simplified implementation of Arabic text rendering.
First you need to decode the UTF-8 text into Unicode characters (code points). For more information: https://en.wikipedia.org/wiki/UTF-8
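As a rough illustration of this decoding step, here is a minimal sketch in Python. The function name `decode_utf8` is made up for this example, and the decoder assumes reasonably well-formed input: it rejects bad lead and continuation bytes but does not check for overlong encodings or surrogates.

```python
def decode_utf8(data: bytes) -> list:
    """Decode UTF-8 bytes into a list of Unicode code points (integers)."""
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                  # 0xxxxxxx: single byte (ASCII)
            cp, extra = lead, 0
        elif (lead >> 5) == 0b110:       # 110xxxxx: one continuation byte
            cp, extra = lead & 0x1F, 1
        elif (lead >> 4) == 0b1110:      # 1110xxxx: two continuation bytes
            cp, extra = lead & 0x0F, 2
        elif (lead >> 3) == 0b11110:     # 11110xxx: three continuation bytes
            cp, extra = lead & 0x07, 3
        else:
            raise ValueError(f"invalid lead byte at offset {i}")
        if i + extra >= len(data):
            raise ValueError("truncated sequence at end of input")
        for b in data[i + 1 : i + 1 + extra]:
            if (b & 0xC0) != 0x80:       # continuation bytes look like 10xxxxxx
                raise ValueError(f"invalid continuation byte after offset {i}")
            cp = (cp << 6) | (b & 0x3F)  # append the 6 payload bits
        code_points.append(cp)
        i += 1 + extra
    return code_points
```

In Python itself, `data.decode("utf-8")` already does this (with stricter validation); the manual loop is only meant to show what the step involves if you are writing it in another language.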
Once you have your Unicode sequence, you need to determine the display order. For simple Arabic text, you can assume it runs from right to left, and switch direction when you encounter left-to-right text or digits. The direction of a character depends on its Bidi class, which is column 5 in UnicodeData.txt of the Unicode database: https://www.unicode.org/reports/tr44/
The general-purpose Bidi algorithm is not trivial, in particular because one can insert Unicode control characters to embed left-to-right text, for example. It is all explained in detail here: http://www.unicode.org/reports/tr9/
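The simplified direction switching can be sketched as follows. This is not the Unicode Bidi algorithm from TR9, only the shortcut described above, and the names `split_runs` and `visual_order` are invented for the example. Python's standard `unicodedata.bidirectional` returns the same Bidi class that UnicodeData.txt lists. One deliberate simplification: neutral characters such as spaces simply stay in whatever run they follow.

```python
import unicodedata

LTR_CLASSES = {"L", "EN", "AN"}  # left-to-right letters, European and Arabic digits
RTL_CLASSES = {"R", "AL"}        # right-to-left letters (Hebrew, Arabic, ...)

def split_runs(text, base="rtl"):
    """Split logical-order text into [direction, characters] runs."""
    runs = []
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in LTR_CLASSES:
            direction = "ltr"
        elif bidi in RTL_CLASSES:
            direction = "rtl"
        else:
            # Simplification: neutrals (spaces, punctuation, marks) stay in the
            # current run, or take the base direction at the start of the text.
            direction = runs[-1][0] if runs else base
        if runs and runs[-1][0] == direction:
            runs[-1][1] += ch
        else:
            runs.append([direction, ch])
    return runs

def visual_order(text):
    """Return the characters in left-to-right screen order for RTL base text."""
    pieces = []
    # Runs are laid out right to left; only RTL runs are reversed internally.
    for direction, chars in reversed(split_runs(text)):
        pieces.append(chars if direction == "ltr" else chars[::-1])
    return "".join(pieces)
```

For example, the logical string SEEN LAM ALEF MEEM, space, "123" comes out as "123", space, MEEM ALEF LAM SEEN. Reversing a run character by character also moves combining marks in front of their base letter, so a fuller implementation would keep each mark attached to its base.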
Afterwards, for Arabic text, you need to determine the ligatures. Letters join together depending on their joining type, defined in column 3 of ArabicShaping.txt (in the Unicode database). For example, if a letter of type L (left-joining) is to the right of a letter of type R (right-joining), they will join together.
The algorithm is simple: for each Arabic letter, determine its joining type. If it can join, look for a letter to its left and to its right and check whether their joining types are complementary.
Note that there may be characters that are not Arabic letters. The Bidi class mentioned earlier can help you determine the joining type of a character that is not listed in ArabicShaping.txt: if the Bidi class is NSM (non-spacing mark) or the general category is Cf, the joining type is T (transparent); otherwise it is U (non-joining).
When checking right and left, skip characters of type T until you find another type or reach the end of the text. Note that if you implement the Bidi algorithm, you need to stop at the end of a Bidi isolate.
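The joining check described above might look like the following sketch. It works on the text in logical (storage) order, where the "previous" character is the one that appears visually to the right in Arabic. The `JOINING_TYPE` table is only a small hand-copied excerpt of ArabicShaping.txt for illustration; a real implementation would load the whole file. The function names are invented for this example, and type C (join-causing, used by tatweel) is handled as an addition to the rules stated above.

```python
import unicodedata

# Small excerpt of ArabicShaping.txt (code point -> joining type), for illustration only.
JOINING_TYPE = {
    0x0621: "U",  # HAMZA
    0x0627: "R",  # ALEF
    0x0628: "D",  # BEH
    0x062F: "R",  # DAL
    0x0631: "R",  # REH
    0x0633: "D",  # SEEN
    0x0640: "C",  # TATWEEL
    0x0644: "D",  # LAM
    0x0645: "D",  # MEEM
    0x0648: "R",  # WAW
}

def joining_type(ch):
    """Joining type from the table, else T for marks/format characters, else U."""
    if ord(ch) in JOINING_TYPE:
        return JOINING_TYPE[ord(ch)]
    if unicodedata.bidirectional(ch) == "NSM" or unicodedata.category(ch) == "Cf":
        return "T"
    return "U"

def neighbour_type(text, i, step):
    """Joining type of the nearest non-transparent character in one direction."""
    j = i + step
    while 0 <= j < len(text):
        jt = joining_type(text[j])
        if jt != "T":
            return jt
        j += step
    return "U"  # reached the end of the text: nothing to join with

def presentation_forms(text):
    """Return 'isolated', 'initial', 'medial' or 'final' for each character."""
    forms = []
    for i, ch in enumerate(text):
        jt = joining_type(ch)
        # R and D letters can connect to the previous letter (visually on the right).
        joins_prev = jt in ("D", "R") and neighbour_type(text, i, -1) in ("D", "L", "C")
        # L and D letters can connect to the next letter (visually on the left).
        joins_next = jt in ("D", "L") and neighbour_type(text, i, +1) in ("D", "R", "C")
        if joins_prev and joins_next:
            forms.append("medial")
        elif joins_prev:
            forms.append("final")
        elif joins_next:
            forms.append("initial")
        else:
            forms.append("isolated")
    return forms
```

For the word SEEN LAM ALEF MEEM this gives initial, medial, final, isolated: ALEF only joins towards the previous letter, so the MEEM after it stands alone. Putting a space between two letters makes both isolated, which matches the Notepad behaviour described in the question.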
Once you know whether the character joins on each side, you can determine its presentation form: initial, medial, final or isolated. In UnicodeData.txt, search for the entry whose decomposition (column 6) is that form followed by the letter's code point; the character to use is the code point of that entry (column 1). For example, "<initial> 067B" is presented as character FB54. You can thus replace the letter with it.
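One way to build this lookup without parsing UnicodeData.txt by hand is Python's `unicodedata.decomposition`, which returns the same decomposition field (for example `"<initial> 067B"` for U+FB54). The sketch below scans the two Arabic Presentation Forms blocks and keeps only single-letter entries; the names `build_presentation_table` and `to_presentation_form` are made up for the example.

```python
import unicodedata

FORM_TAGS = {"<isolated>", "<initial>", "<medial>", "<final>"}

def build_presentation_table():
    """Map (form, base code point) -> presentation-form code point."""
    table = {}
    # Arabic Presentation Forms-A (FB50..FDFF) and Forms-B (FE70..FEFF).
    for block in (range(0xFB50, 0xFE00), range(0xFE70, 0xFF00)):
        for cp in block:
            fields = unicodedata.decomposition(chr(cp)).split()
            # Keep "<form> XXXX" entries; multi-letter ligatures are skipped here.
            if len(fields) == 2 and fields[0] in FORM_TAGS:
                key = (fields[0].strip("<>"), int(fields[1], 16))
                table.setdefault(key, cp)  # keep the first match if duplicated
    return table

def to_presentation_form(cp, form, table):
    """Return the presentation form, or the code point unchanged if none exists."""
    return table.get((form, cp), cp)
```

With this table, `to_presentation_form(0x067B, "initial", table)` gives `0xFB54`, the example from the text. Characters with no entry (Latin letters, digits, letters lacking a given form) pass through unchanged.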
There is a special case for Lam followed by Alef: the two characters are merged into a single one. For example, "<isolated> 0644 0622" is presented as character FEF5. Note that you may encounter non-spacing marks in between. If you don't handle them, you can discard them; otherwise, keep the information for later.
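A possible sketch of the Lam-Alef merge, again using `unicodedata.decomposition` to find the ligatures (U+FEF5 to U+FEFC in Arabic Presentation Forms-B). The function names and the `joins_prev` parameter are assumptions made for this example: `joins_prev[i]` is expected to say whether character `i` connects to the letter before it, as computed in the joining step. Marks between Lam and Alef are discarded, which is the simpler of the two options mentioned above.

```python
import unicodedata

LAM = 0x0644

def build_lam_alef_table():
    """Map (form, alef code point) -> LAM WITH ALEF ligature code point."""
    table = {}
    for cp in range(0xFEF5, 0xFEFD):  # the eight LAM WITH ALEF ligatures
        form, first, second = unicodedata.decomposition(chr(cp)).split()
        if int(first, 16) == LAM:
            table[(form.strip("<>"), int(second, 16))] = cp
    return table

def merge_lam_alef(code_points, joins_prev, table):
    """Replace each LAM + ALEF pair (logical order) with its ligature."""
    out = []
    i = 0
    while i < len(code_points):
        cp = code_points[i]
        if cp == LAM:
            j = i + 1
            # Look past non-spacing marks sitting between LAM and ALEF.
            while (j < len(code_points)
                   and unicodedata.bidirectional(chr(code_points[j])) == "NSM"):
                j += 1
            # The ligature is "final" if LAM connects to the previous letter.
            form = "final" if joins_prev[i] else "isolated"
            if j < len(code_points) and (form, code_points[j]) in table:
                out.append(table[(form, code_points[j])])
                i = j + 1  # consume LAM, the skipped marks and ALEF
                continue
        out.append(cp)
        i += 1
    return out
```

For instance, `merge_lam_alef([0x0644, 0x0622], [False, True], table)` yields `[0xFEF5]`, the example from the text, while BEH LAM ALEF yields BEH followed by the final-form ligature U+FEFC. When Lam is not followed by an Alef, the sequence, including any marks, is left untouched.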
At this stage, you can convert the array of Unicode characters back to a UTF-8 string and draw it with FreeType. Note that non-spacing marks will not be placed correctly this way. To place them, you need to draw each character separately and determine the positions of the marks.
Upvotes: 3
Reputation: 946
The whole process, at the level at which FreeType works (glyph rendering), is described for example here. As you can see, it is anything but simple.
Several libraries sit on top of FreeType with the aim of making that process "simple", or at least simpler, but they work at a higher level of abstraction, so you will probably need to change your approach. HarfBuzz is one such project, and it is closely associated with FreeType.
Upvotes: 2