Digger

Reputation: 183

Display-width of multibyte character in C standard library – how accurate is the database?

The wcwidth() function (specified by POSIX rather than ISO C, though it ships with the C library) returns 2 for East Asian characters. For other Unicode symbols, such as arrows, it returns 1. Often the glyph is wider than a single column, yet the library isn't wrong: terminals print such characters in a single column and let them visually overlap the next cell, sometimes with acceptable results, as with the en dash "–".
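To make the per-character widths concrete, here is a minimal sketch that sums wcwidth() over a wide string. The helper name string_columns is hypothetical, and the result depends entirely on the locale tables in use, so treat it as an illustration rather than a reference implementation.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* asks glibc to declare wcwidth() */
#endif
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: total column count of a wide string according
 * to the C library's tables. Returns -1 if any character is reported
 * as non-printable in the current locale. */
int string_columns(const wchar_t *s)
{
    int total = 0;
    for (; *s != L'\0'; ++s) {
        int w = wcwidth(*s);   /* 0, 1 or 2 for printable, -1 otherwise */
        if (w < 0)
            return -1;
        total += w;
    }
    return total;
}
```

A caller would normally run setlocale(LC_ALL, "") first so that a UTF-8 locale is active; without it, non-ASCII characters may be reported as non-printable.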

Are there characters that plainly suffer? I wonder how people in Asia and other regions use terminals, and what solutions they have developed. For example, displaying a shell prompt that spans the whole line and contains the current directory name can be a serious problem. Can wcwidth be patched to obtain better results, for example using github/wcwidth.c as a starting point?

Upvotes: 0

Views: 123

Answers (1)

Thomas Dickey

Reputation: 54465

There are differences in how the ambiguous-width characters are handled. xterm has both Markus Kuhn's original (the link you show appears to be his, with the comment header removed) and an alternate version with adjustments to accommodate CJK (East Asian). Besides that, it checks at startup for usable system locale tables. Some are good enough; others are not. No one has done a systematic (unbiased) survey of what's actually good (you may see some opinions on that aspect, offered as answers).
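The idea of a CJK adjustment can be sketched roughly as follows. This is not xterm's or Kuhn's code: the names cp_range, ambiguous_sample and adjusted_width are hypothetical, and the table is a small hand-picked sample of ranges that I believe are classified as East Asian Ambiguous (check Unicode's EastAsianWidth.txt before relying on it). A real table is far longer.

```c
#include <stddef.h>

/* Inclusive range of code points. */
struct cp_range { unsigned long first, last; };

/* Illustrative sample only, NOT a complete ambiguous-width table. */
static const struct cp_range ambiguous_sample[] = {
    { 0x00A1, 0x00A1 },   /* inverted exclamation mark */
    { 0x0391, 0x03A1 },   /* Greek capitals Alpha..Rho */
    { 0x2013, 0x2015 },   /* en dash, em dash, horizontal bar */
    { 0x2190, 0x2199 },   /* arrows */
};

/* Linear search; a full-size table would use binary search. */
static int in_sample(unsigned long cp)
{
    size_t n = sizeof ambiguous_sample / sizeof ambiguous_sample[0];
    for (size_t i = 0; i < n; ++i)
        if (cp >= ambiguous_sample[i].first && cp <= ambiguous_sample[i].last)
            return 1;
    return 0;
}

/* base_width is whatever the default table reported (for example the
 * result of wcwidth()). In CJK mode, ambiguous characters that the
 * default table calls narrow are widened to two columns. */
int adjusted_width(unsigned long cp, int base_width, int cjk_mode)
{
    if (cjk_mode && base_width == 1 && in_sample(cp))
        return 2;
    return base_width;
}
```

With this sketch, a right arrow (U+2192) stays at one column by default and becomes two columns when cjk_mode is nonzero, while ASCII letters are unaffected in both modes.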

Upvotes: 1
