Reputation: 12215
Let's assume we have a text that contains a Unicode character that cannot be displayed because our font has no corresponding glyph. Usually, a placeholder is displayed instead, e.g. a rectangular block thingy (see screenshot).
Is there a "glyph not found" character that reliably produces this glyph? I'd like to write something like "If the following text contains <insert character here> then you need another font..." in a UI.
By the way, I am not talking about � (replacement character). This one is displayed when a Unicode character could not be correctly decoded from a data stream. It does not necessarily produce the same glyph:
Upvotes: 39
Views: 27355
Reputation: 2664
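There are 3 recommended shapes for the "glyph not found" (.notdef) glyph: an empty rectangle, a rectangle with a question mark inside it, or a rectangle with an "X".
See the Microsoft OpenType specification, topic "Shape of .notdef glyph": https://learn.microsoft.com/en-us/typography/opentype/otspec170/recom#shape-of-notdef-glyph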
Upvotes: -1
Reputation:
There is a .notdef glyph that means the glyph was not found, but it has no character code of its own. You can use the character codes of control characters (for example U+0002) to insert a .notdef glyph.
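A minimal sketch of that idea in Python, assuming you want to build the UI string with U+0002 as the probe. The names `is_control`, `PROBE` and `message` are made up for illustration. Whether a control character is actually drawn as the .notdef glyph depends on the renderer; many text engines draw nothing at all, so treat this as something to test on your target platform.

```python
import unicodedata


def is_control(ch):
    """True if ch is in Unicode general category Cc (control)."""
    return unicodedata.category(ch) == "Cc"


# Hypothetical probe: U+0002 (START OF TEXT), a C0 control character.
PROBE = "\u0002"

# Hypothetical UI message that embeds the probe character.
message = f"If the following text contains {PROBE} then you need another font."

print(is_control(PROBE))      # control characters have category Cc
print(repr(message))          # repr() makes the invisible character visible
```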
Upvotes: 3
Reputation: 96707
Unicode uses these terms:
The Unicode Standard (10.0) does not define how they have to look, but it suggests in chapter 5.3 [PDF] that implementations display
[…] distinctive glyphs that give some general indication of their type […]
to distinguish them from "unassigned code points". They give some examples:
The Unicode glossary entry says:
It often is shown as an open or black rectangle.
tl;dr: There is no standardized look/glyph, it’s up to the implementation. To help users, implementations could display glyphs that indicate what type of character it is that can’t be displayed.
Upvotes: 5
Reputation: 201758
No, there is no “glyph not found” character. Different programs use different graphic presentations. An empty narrow rectangle is a common rendering, but not the only one. It could also be a rectangle with a question mark in it or with the code number of the character, in hexadecimal, in it.
So it is better to e.g. display a small image of the character along with the character itself, so that the reader can compare them.
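Besides an image, a text-only fallback is to name the character by its code point, which stays readable even when the glyph is missing. A rough sketch in Python, where `describe` is a hypothetical helper and skipping ASCII is just an assumption about which characters are worth listing:

```python
import unicodedata


def describe(text):
    """Return 'U+XXXX NAME' labels for each distinct non-ASCII character."""
    labels = []
    seen = set()
    for ch in text:
        if ord(ch) < 0x80 or ch in seen:
            continue  # assume ASCII is always displayable; skip repeats
        seen.add(ch)
        # Characters without a name get a placeholder label.
        name = unicodedata.name(ch, "<unnamed>")
        labels.append(f"U+{ord(ch):04X} {name}")
    return labels


print(describe("Price: 5 \u20ac or 5 \u25a1"))
```

The resulting labels can be shown next to the text so the reader knows which characters are supposed to appear.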
Upvotes: 15
Reputation: 3077
From the Unicode Spec:
U+25A1
□ WHITE SQUARE
may be used to represent a missing ideograph
→ U+20DE
$⃞ combining enclosing square
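For reference, a short Python sketch that looks these two characters up by code point. Note that U+25A1 is an ordinary character with its own glyph, so it only imitates the usual "missing glyph" box; it is not the font's .notdef glyph, and a font may lack it as well. The variable names are made up for illustration.

```python
import unicodedata

WHITE_SQUARE = "\u25a1"
ENCLOSING_SQUARE = "\u20de"

# Look up the official Unicode names of both code points.
print(unicodedata.name(WHITE_SQUARE))
print(unicodedata.name(ENCLOSING_SQUARE))

# Hypothetical UI message using WHITE SQUARE as a stand-in for the box.
message = f"If the following text contains {WHITE_SQUARE} then you need another font."
print(message)
```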
Upvotes: 26
Reputation: 2409
Use a non-character like U+10FFFF (at the very end of the Unicode space), which is 99.99% certain not to be found in the cmap table of any sane font. At least no known Windows system font maps that non-character to a glyph, and it is highly unlikely that any Linux/Mac system font does either. Even the all-encompassing Last Resort font (http://www.unicode.org/policies/lastresortfont_eula.html) doesn't appear to map it.
So while there is no official "glyph not found" character defined in Unicode that will map to the .notdef glyph, the above non-character will in practice display that glyph, whatever the glyph design is in that particular font. The .notdef glyph (glyph ID 0 in OpenType) may be a simple hollow rectangle (standard), a box with an x, a box with a question mark, occasionally blank (which is bad practice), and sometimes bizarre things like spirals (in Palatino Linotype).
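A small Python sketch of how such a probe could be built and sanity-checked. The helper `is_noncharacter` and the name `PROBE` are made up for illustration; the check encodes the 66 noncharacters defined by Unicode (U+FDD0..U+FDEF plus the last two code points of every plane). Whether a given text stack passes noncharacters through to the font unchanged is an assumption you should verify.

```python
def is_noncharacter(cp):
    """True if cp is one of the 66 Unicode noncharacter code points."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    # U+FDD0..U+FDEF, or the last two code points of any plane (xxFFFE, xxFFFF).
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


# The probe suggested above: the very last code point in Unicode.
PROBE = chr(0x10FFFF)

print(is_noncharacter(ord(PROBE)))
print(PROBE.encode("utf-8"))  # the four-byte UTF-8 form of U+10FFFF
```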
Upvotes: 3
Reputation: 799150
The glyph-not-found character is specified by the font engine and by the font; there is no fixed character for it.
Upvotes: 7