Reputation: 41
I am trying to align strings that contain Unicode characters.
But it doesn't work.
The spacing is not correct. :(
The Lua version is 5.1.
What is the problem?
local t =
{
    "character",
    "루아", -- korean
    "abc감사합니다123", -- korean
    "ab23",
    "lua is funny",
    "ㅇㅅㅇ",
    "美國大將", --chinese
    "qwert-54321",
};
for k, v in pairs(t) do
    print(string.format("%30s", v));
end
result:----------------------------------------------
character
루아
abc감사합니다123
ab23
lua is funny
ㅇㅅㅇ
美國大將
qwert-54321
Upvotes: 3
Views: 1977
Reputation: 23757
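function utf8format(fmt, ...)
    local args, strings, pos = {...}, {}, 0
    for spec in fmt:gmatch'%%.-([%a%%])' do
        if spec ~= '%' then -- a literal "%%" consumes no argument
            pos = pos + 1
            local s = args[pos]
            if spec == 's' and type(s) == 'string' and s ~= '' then
                table.insert(strings, s)
                -- placeholder with one byte per UTF-8 character:
                -- continuation bytes (\128-\191) are not counted
                args[pos] = '\1'..('\2'):rep(#s:gsub("[\128-\191]", "")-1)
            end
        end
    end
    -- format with the placeholders, then put the original strings back
    return (fmt:format((table.unpack or unpack)(args))
        :gsub('\1\2*', function() return table.remove(strings, 1) end)
    )
end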
local t =
{
    "character",
    "루아", -- korean
    "abc감사합니다123", -- korean
    "ab23",
    "lua is funny",
    "ㅇㅅㅇ",
    "美國大將", --chinese
    "qwert-54321",
    "∞"
};
for k, v in pairs(t) do
    print(utf8format("%30s", v));
end
But there is another problem: in most fonts, Korean and Chinese characters are wider than Latin letters, so the columns may still not line up visually even when the character counts match.
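One possible way to account for the wider characters is to pad by display columns instead of character count. The following is only a rough sketch for Lua 5.1: it assumes valid UTF-8 input and a terminal that draws CJK and Hangul characters two columns wide. The helper names (`codepoints`, `displaywidth`, `alignright`) and the `wide_ranges` table are made up for illustration, and the ranges are an approximation, not the full Unicode East Asian Width data.

```lua
-- Decode UTF-8 into code points with plain arithmetic
-- (Lua 5.1 has no utf8 library). Assumes the input is valid UTF-8.
local function codepoints(s)
    local cps, i = {}, 1
    while i <= #s do
        local b = s:byte(i)
        local n, cp
        if b < 0x80 then n, cp = 1, b
        elseif b < 0xE0 then n, cp = 2, b - 0xC0
        elseif b < 0xF0 then n, cp = 3, b - 0xE0
        else n, cp = 4, b - 0xF0 end
        -- each continuation byte contributes its low 6 bits
        for j = i + 1, i + n - 1 do
            cp = cp * 64 + ((s:byte(j) or 0x80) - 0x80)
        end
        cps[#cps + 1] = cp
        i = i + n
    end
    return cps
end

-- Approximate list of ranges drawn two columns wide (an assumption).
local wide_ranges = {
    {0x1100, 0x115F}, -- Hangul Jamo
    {0x2E80, 0xA4CF}, -- CJK radicals, kana, compatibility jamo, ideographs
    {0xAC00, 0xD7A3}, -- Hangul syllables
    {0xF900, 0xFAFF}, -- CJK compatibility ideographs
    {0xFF00, 0xFF60}, -- fullwidth forms
}

-- Number of terminal columns the string is expected to occupy.
local function displaywidth(s)
    local w = 0
    for _, cp in ipairs(codepoints(s)) do
        local cols = 1
        for _, r in ipairs(wide_ranges) do
            if cp >= r[1] and cp <= r[2] then
                cols = 2
                break
            end
        end
        w = w + cols
    end
    return w
end

-- Right-align s in a field of `width` display columns.
local function alignright(s, width)
    return (" "):rep(math.max(0, width - displaywidth(s))) .. s
end

print(alignright("character", 30))
print(alignright("루아", 30))
print(alignright("美國大將", 30))
```

Whether the output actually lines up still depends on the font and terminal, so treat the width table as something to adjust for your environment.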
Upvotes: 4
Reputation: 122453
The ASCII strings are all formatted correctly, while the non-ASCII strings are not.
The reason is that the length of a string is counted in bytes, not characters. For instance, with UTF-8 encoding:
print(string.len("美國大將")) -- 12
print(string.len("루아")) -- 6
So %s in string.format treats these two strings as if their widths were 12 and 6, respectively, and pads them with too few spaces.
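To illustrate the difference, here is a minimal sketch that counts characters instead of bytes and pads by hand. It assumes the strings are valid UTF-8; `utf8len` and `padleft` are hypothetical helper names, not part of Lua 5.1.

```lua
-- Count UTF-8 characters by removing continuation bytes (0x80-0xBF),
-- so that only one byte per character remains.
local function utf8len(s)
    local stripped = s:gsub("[\128-\191]", "")
    return #stripped
end

-- Right-align s in a field of `width` characters.
local function padleft(s, width)
    return (" "):rep(math.max(0, width - utf8len(s))) .. s
end

print(#"루아", utf8len("루아"))         -- byte count vs. character count
print(#"美國大將", utf8len("美國大將"))
print(padleft("루아", 30))
```

This fixes the padding in terms of character count only; it does not account for characters that are drawn wider than one column.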
Upvotes: 2