Reputation: 2047
I am currently looking at the TTF Specification and notice that it talks quite a lot about "tables". My understanding is that these tables represent a data structure; however, my googling has turned up no explanation of how I might extract this data from my TTF file, how I might figure out how many bytes each table occupies, how to tell one table from another, or how to identify each piece of data in each table...
Currently all I know is that I am given a data type for each piece of information (e.g. here), which might help me identify each piece of data in a table, but that comes back to the issue of not understanding how to determine the location of the tables themselves in the file.
If anyone could explain the theory behind how this works, or show any short but well-commented code snippets to help me understand this better, it would be much appreciated.
Upvotes: 0
Views: 135
Reputation: 106126
user3159253's quite right - the very doc you link to provides all the information you need... for example:
Note that the searchRange, the entrySelector and the rangeShift are all multiplied by 16 which represents the size of a directory entry.
Table 4 : The offset subtable
Type    Name           Description
uint32  scaler type    A tag to indicate the OFA scaler to be used to rasterize this font; see the note on the scaler type below for more information.
uint16  numTables      number of tables
uint16  searchRange    (maximum power of 2 <= numTables)*16
uint16  entrySelector  log2(maximum power of 2 <= numTables)
uint16  rangeShift     numTables*16-searchRange
Your related code might look something like:
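#include <cstdint>
typedef std::uint16_t uint16;   // the spec's type names
typedef std::uint32_t uint32;
struct Offset_Subtable
{
// Raw fields, stored big-endian exactly as they appear in the file.
uint32 scaler_type_;
uint16 numTables_;
uint16 searchRange_;
uint16 entrySelector_;
uint16 rangeShift_;
// Assemble the tag byte by byte so the comparisons below work on any host.
uint32 scaler_type() const
{
const unsigned char* p = reinterpret_cast<const unsigned char*>(&scaler_type_);
return (uint32(p[0]) << 24) | (uint32(p[1]) << 16) | (uint32(p[2]) << 8) | uint32(p[3]);
}
bool is_for_macOS() const { return scaler_type() == 0x74727565; }
bool is_for_windows() const { return scaler_type() == 0x00010000; }
bool is_truetype() const { return scaler_type() == 0x74727565 ||
scaler_type() == 0x00010000; }
bool is_postscript() const { return scaler_type() == 0x74797031; }
};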
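The searchRange, entrySelector and rangeShift fields in Table 4 are all derived from numTables, so you can recompute them to sanity-check a file. Here is a minimal sketch of those formulas; the SearchFields struct and compute_search_fields function are names I made up for illustration, and the code assumes numTables is at least 1:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the three binary-search fields of the offset subtable.
struct SearchFields
{
    std::uint16_t searchRange;
    std::uint16_t entrySelector;
    std::uint16_t rangeShift;
};

// Derive the fields from numTables using the formulas quoted in Table 4.
// Assumption: num_tables >= 1.
inline SearchFields compute_search_fields(std::uint16_t num_tables)
{
    assert(num_tables >= 1);
    unsigned power = 1;      // largest power of 2 <= num_tables
    unsigned exponent = 0;   // log2(power)
    while (power * 2 <= num_tables)
    {
        power *= 2;
        ++exponent;
    }
    SearchFields f;
    f.searchRange = static_cast<std::uint16_t>(power * 16);
    f.entrySelector = static_cast<std::uint16_t>(exponent);
    f.rangeShift = static_cast<std::uint16_t>(num_tables * 16 - power * 16);
    return f;
}
```

For a font with 10 tables this gives searchRange 128, entrySelector 3 and rangeShift 32.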
Given:
Table 5: The table directory
Type    Name      Description
uint32  tag       4-byte identifier
uint32  checkSum  checksum for this table
uint32  offset    offset from beginning of sfnt
uint32  length    length of this table in byte (actual length not padded length)
You might keep coding:
struct Table_Directory_Entry
{
uint32 tag_;
uint32 checkSum_;
uint32 offset_;
uint32 length_;
};
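const void* p_ttf = ...;
const Offset_Subtable* p_os = static_cast<const Offset_Subtable*>(p_ttf);
... use any data you're interested in...
// TTF integers are big-endian, so assemble numTables from its two raw bytes
// instead of reading numTables_ directly (which is wrong on a little-endian host).
const unsigned char* p_raw = reinterpret_cast<const unsigned char*>(&p_os->numTables_);
const unsigned num_tables = (p_raw[0] << 8) | p_raw[1];
// The directory entries immediately follow the 12-byte offset subtable.
const Table_Directory_Entry* p_first =
reinterpret_cast<const Table_Directory_Entry*>(&p_os[1]);
for (unsigned table_num = 0; table_num < num_tables; ++table_num)
{
const Table_Directory_Entry* p_tde = &p_first[table_num];
// The fields of *p_tde are big-endian too; byte-swap them before use.
... use the table directory entry data ...
}
...etc....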
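If you would rather not overlay structs on the raw memory at all, you can decode the same two tables by reading bytes at fixed positions. This is only a sketch: read_u16, read_u32, TableInfo and read_table_directory are names invented for this example, and it assumes the whole font file is already in memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Read big-endian integers byte by byte, so the result is correct on any host.
inline std::uint16_t read_u16(const unsigned char* p)
{
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

inline std::uint32_t read_u32(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
            static_cast<std::uint32_t>(p[3]);
}

// Hypothetical record holding one decoded table directory entry.
struct TableInfo
{
    std::string tag;          // 4-byte identifier, e.g. "head"
    std::uint32_t checkSum;
    std::uint32_t offset;     // from the beginning of the file
    std::uint32_t length;     // actual (unpadded) length in bytes
};

// Decode the table directory of a font held in memory.
// Returns an empty vector if the buffer is too small to hold the directory.
inline std::vector<TableInfo> read_table_directory(const unsigned char* data,
                                                   std::size_t size)
{
    std::vector<TableInfo> tables;
    const std::size_t header_size = 12;   // offset subtable (Table 4)
    const std::size_t entry_size = 16;    // one directory entry (Table 5)
    if (size < header_size)
        return tables;

    const std::size_t num_tables = read_u16(data + 4);
    if (size < header_size + num_tables * entry_size)
        return tables;

    for (std::size_t i = 0; i < num_tables; ++i)
    {
        const unsigned char* entry = data + header_size + i * entry_size;
        TableInfo info;
        info.tag.assign(reinterpret_cast<const char*>(entry), 4);
        info.checkSum = read_u32(entry + 4);
        info.offset = read_u32(entry + 8);
        info.length = read_u32(entry + 12);
        tables.push_back(info);
    }
    return tables;
}
```

Each entry then tells you where its table lives: the table's bytes run from data + offset for length bytes, and the tag tells you which table it is.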
Now, some of the raw data is quite confusing ("magic" sentinel values, integers where you'd prefer to see results as values from an enum, numbers that need to be multiplied or divided or offset to yield the value someone might intuitively expect them to hold), so adding little helper functions like is_truetype() is a good start, and you can go a bit further by preventing use of the raw fields by making them private while exposing is_truetype() et al as public member functions.
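As a rough illustration of hiding the raw fields, here is one way it could look. OffsetSubtableView is a made-up name, and the sketch assumes the caller passes a pointer to at least 12 readable bytes:

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical wrapper: the raw bytes are private, only interpreted values are public.
class OffsetSubtableView
{
public:
    // Copies the 12 raw header bytes; the caller guarantees they are available.
    explicit OffsetSubtableView(const unsigned char* raw)
    {
        std::memcpy(raw_, raw, sizeof raw_);
    }

    bool is_truetype() const
    {
        return scaler_type() == 0x74727565u || scaler_type() == 0x00010000u;
    }
    bool is_postscript() const { return scaler_type() == 0x74797031u; }

    // numTables, decoded from its big-endian representation.
    unsigned num_tables() const
    {
        return (static_cast<unsigned>(raw_[4]) << 8) | raw_[5];
    }

private:
    // The magic tag stays an implementation detail.
    std::uint32_t scaler_type() const
    {
        return (static_cast<std::uint32_t>(raw_[0]) << 24) |
               (static_cast<std::uint32_t>(raw_[1]) << 16) |
               (static_cast<std::uint32_t>(raw_[2]) << 8) |
                static_cast<std::uint32_t>(raw_[3]);
    }

    unsigned char raw_[12];
};
```

Callers can then ask view.is_truetype() or view.num_tables() without ever seeing the magic numbers or worrying about byte order.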
If you need your structures to have extra data that's not part of the TTF content, then this approach of casting the raw data to your structure to help you parse/interpret it breaks down. Instead, you could use one of these raw-data-only structures for convenient interpretation of the raw memory, supporting a higher-level class that's created on the stack or heap or at global/static scope, and that initialises itself given a pointer or reference to the raw-data structure. This could be done in the constructor, assignment operator, a streaming operator>>, or any general member function - whatever you find suits your code. You can then pull out and persist parsed data in standard containers etc.
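To make that concrete, here is a small sketch of such a higher-level class that parses in its constructor and keeps the results in a std::map. The class name FontDirectory and its members are invented for this example, and it reads the raw bytes directly rather than going through a raw-data struct:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical higher-level class: owns parsed data instead of overlaying raw memory.
class FontDirectory
{
public:
    struct Location { std::uint32_t offset; std::uint32_t length; };

    // Parses the table directory from the raw file bytes.
    // Entries that do not fit inside the buffer are ignored.
    explicit FontDirectory(const std::vector<unsigned char>& bytes)
    {
        if (bytes.size() < 12)
            return;
        const std::size_t count = be(bytes, 4, 2);
        for (std::size_t i = 0; i < count; ++i)
        {
            const std::size_t pos = 12 + i * 16;
            if (pos + 16 > bytes.size())
                break;
            const std::string tag(bytes.begin() + pos, bytes.begin() + pos + 4);
            tables_[tag] = Location{be(bytes, pos + 8, 4), be(bytes, pos + 12, 4)};
        }
    }

    std::size_t table_count() const { return tables_.size(); }

    // Returns nullptr when the font has no table with this tag.
    const Location* find(const std::string& tag) const
    {
        const auto it = tables_.find(tag);
        return it == tables_.end() ? nullptr : &it->second;
    }

private:
    // Reads an n-byte big-endian integer starting at bytes[pos].
    static std::uint32_t be(const std::vector<unsigned char>& bytes,
                            std::size_t pos, std::size_t n)
    {
        std::uint32_t value = 0;
        for (std::size_t i = 0; i < n; ++i)
            value = (value << 8) | bytes[pos + i];
        return value;
    }

    std::map<std::string, Location> tables_;
};
```

Because the object holds its own copies, it stays valid after the raw buffer is freed, and you are free to add members that have no counterpart in the file.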
Upvotes: 0
Reputation: 17455
The first approach is to carefully study the Apple documentation you mention; it does contain descriptions of the data structures. You can also take a look at libfreetype and learn how these data structures are processed in real-world code.
Upvotes: 1