Kyle

Reputation: 990

Do modern terminals generally render all utf-8 characters correctly?

I am writing an application in C that will be run in a terminal, and it would be handy, but not necessary, to use some of the less common Unicode characters. In my experiments so far, I have not had any trouble rendering them. However, I would avoid non-ASCII characters altogether if they were a likely source of trouble in the future.

So, in short, can I count on just about any terminal or terminal emulator in the modern *nix world (mainly Linux, FreeBSD, and OS X) to properly render arbitrary UTF-8 characters?

If I cannot make that assumption: there are particular subsets of Unicode characters defined for various purposes, so is there some such subset that would at least be reliably rendered in any likely modern *nix terminal or terminal emulator?

NOTE: When I say arbitrary, I do mean arbitrary: any Unicode character. But for completeness, I will note that I am primarily interested in arrows and mathematical symbols. This link has lists of both: https://en.wikipedia.org/wiki/Unicode_symbols.

Upvotes: 4

Views: 1025

Answers (2)

nwellnhof

Reputation: 33658

For the most part, this depends on the font, not the terminal. But there are a couple of things the terminal software itself has to take into account, for example the halfwidth and fullwidth forms of CJK characters, which occupy a different number of columns.
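The width issue also matters to the application if it aligns output in columns. A minimal sketch, assuming a POSIX system that provides `wcwidth()`; the helper name `display_columns` is made up for illustration, and the result is only the C library's opinion of the width, which the terminal may not share:

```c
#define _XOPEN_SOURCE 700   /* asks the system headers to declare wcwidth() */
#include <wchar.h>

/* Hypothetical helper: number of terminal columns the wide string `s`
 * is expected to occupy, or -1 if it contains a character that the
 * current locale treats as non-printable. */
int display_columns(const wchar_t *s)
{
    int total = 0;
    for (; *s != L'\0'; ++s) {
        int w = wcwidth(*s);   /* 0, 1 or 2 columns; -1 if non-printable */
        if (w < 0)
            return -1;
        total += w;
    }
    return total;
}
```

For non-ASCII input the program should call `setlocale(LC_CTYPE, "")` first (from `<locale.h>`); in the default "C" locale many implementations report characters outside ASCII as non-printable.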

Also, Unicode characters are added on a regular basis. There's no way that every font and terminal software is automatically updated as soon as a new version of the Unicode standard is released.

In general, you should assume that there are always Unicode characters that are not rendered correctly, even on a modern terminal.

Upvotes: 1

Michael Aaron Safyan

Reputation: 95579

No, you should not assume that. Even in a modern system, the set of fonts installed, the font used by the terminal application, and environment variables such as LANG, LC_*, etc. may influence whether certain characters can be displayed correctly on the terminal or not.

You might be able to make reasonable guesses about what is supported based on the values of the TERM, LANG, and LC_* environment variables, but it's still going to be a guess. I'd suggest either not relying on it at all or providing some means of enabling/disabling the use of these characters (via an environment variable and/or command-line flags to the application).
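A rough sketch of that approach in C, assuming a POSIX system. The variable name `MYAPP_UNICODE` and both function names are invented for this example. Note that a UTF-8 locale only suggests the terminal decodes UTF-8; it says nothing about whether the font has the glyphs:

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <strings.h>

/* Pure decision step, kept separate so it is easy to test.
 * `override` is the value of the hypothetical MYAPP_UNICODE variable
 * (NULL or "" when unset); `codeset` is the locale's encoding name. */
int decide_unicode(const char *override, const char *codeset)
{
    /* An explicit user setting always wins: "0" disables,
     * any other non-empty value enables. */
    if (override != NULL && override[0] != '\0')
        return !(override[0] == '0' && override[1] == '\0');

    /* Otherwise guess from the locale's character encoding. */
    if (codeset == NULL)
        return 0;
    return strcasecmp(codeset, "UTF-8") == 0 ||
           strcasecmp(codeset, "UTF8") == 0;
}

/* Reads the real environment. setlocale() picks up LANG and LC_*,
 * and nl_langinfo(CODESET) reports the resulting encoding. */
int want_unicode(void)
{
    setlocale(LC_CTYPE, "");
    return decide_unicode(getenv("MYAPP_UNICODE"), nl_langinfo(CODESET));
}
```

The application would call `want_unicode()` once at startup and fall back to ASCII substitutes (for example `->` instead of an arrow character) when it returns 0. A command-line flag could feed the same `override` argument.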

Upvotes: 6
