YepMe
YepMe

Reputation: 189

How could we get unicode from glyph id in python?

If I have glyph ids like below how can I get the unicode from them, the language is python that I am working on ? Also what I understand the second value is the glyph id but what do we call the first value and the third value?

 (582, 'uni0246', 'LATIN CAPITAL LETTER E WITH STROKE'), (583, 'uni0247', 'LATIN SMALL LETTER E WITH STROKE'), (584, 'uni0248', 'LATIN CAPITAL LETTER J WITHSTROKE'), (585, 'uni0249', 'LATIN SMALL LETTER J WITH STROKE')

Kindly reply.

Actually I am trying to get the unicode from a given ttf file in python.Here is the code :

 from fontTools.ttLib import TTFont
 from fontTools.unicode import Unicode
 from ttfquery import ttfgroups
 from fontTools.ttLib.tables import _c_m_a_p
 from itertools import chain

 ttfgroups.buildTable() 
 ttf = TTFont(sys.argv[1], 0, verbose=0, allowVID=0,
            ignoreDecompileErrors=True,
            fontNumber=-1)

 chars = chain.from_iterable([y + (Unicode[y[0]],) for y in x.cmap.items()] for x in ttf["cmap"].tables)
 print(list(chars))`

This code I got from stackoverflow only but this gives the above output and not what I require. So could anybody please tell me how to fetch the unicodes from the ttf file or is it fine to convert the glyphid to unicode, will it yield to actual unicode ?

Upvotes: 2

Views: 3539

Answers (2)

alijandro
alijandro

Reputation: 12147

Here is a simple way to find all unicode character in ttf file.

chars = []
with TTFont('/path/to/ttf', 0, ignoreDecompileErrors=True) as ttf:
    for x in ttf["cmap"].tables:
        for (code, _) in x.cmap.items():
            chars.append(chr(code))
# now chars is a list of \uxxxx characters
print(chars)

Upvotes: 0

vermillon
vermillon

Reputation: 563

You can use the first field: unichr(x[0]), or equivalently the second field. Then you remove the "uni" part ([3:]) and you convert it to a hexadecimal valu'Ɇ'e, then to a character. Of course, the first method is faster and simpler.

unichr(int(x[1][3:], 16)) #for the first item you've showed, returns 'Ɇ', for the second 'ɇ'

If you use python3, chr instead of unichr.

Upvotes: 4

Related Questions