ddubie

Reputation: 41

String formatting with unicode characters using Lua

I am trying to right-align strings that contain Unicode characters with string.format,
but it doesn't work.
The padding is wrong for the non-ASCII strings. :(
I am using Lua 5.1.
What is the problem?

local t = 
{
    "character",
    "루아",           -- korean
    "abc감사합니다123", -- korean
    "ab23",
    "lua is funny",
    "ㅇㅅㅇ",
    "美國大將",         --chinese
    "qwert-54321",
};

for k, v in pairs(t) do
    print(string.format("%30s", v));
end


Result:
                     character  
                        루아  
          abc감사합니다123   
                          ab23  
                  lua is funny  
                      ㅇㅅㅇ   
                   美國大將 
                   qwert-54321

Upvotes: 3

Views: 1977

Answers (2)

Egor Skriptunoff

Reputation: 23757

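-- Like string.format, but pads %s arguments by UTF-8 character count, not bytes.
function utf8format(fmt, ...)
   local args, strings, pos = {...}, {}, 0
   for spec in fmt:gmatch'%%.-([%a%%])' do
      if spec ~= '%' then   -- "%%" is a literal percent sign and consumes no argument
         pos = pos + 1
         local s = args[pos]
         if spec == 's' and type(s) == 'string' and s ~= '' then
            table.insert(strings, s)
            -- Placeholder with one byte per character (continuation bytes removed)
            args[pos] = '\1'..('\2'):rep(#s:gsub("[\128-\191]", "")-1)
         end
      end
   end
   -- Format the placeholders, then swap the original strings back in, in order
   return (fmt:format((table.unpack or unpack)(args))
      :gsub('\1\2*', function() return table.remove(strings, 1) end)
   )
end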

local t =
{
   "character",
   "루아",           -- korean
   "abc감사합니다123", -- korean
   "ab23",
   "lua is funny",
   "ㅇㅅㅇ",
   "美國大將",         --chinese
   "qwert-54321",
   "∞"
};

for _, v in ipairs(t) do
   print(utf8format("%30s", v));
end

But there is another problem: in most fonts, Korean and Chinese characters are wider than Latin letters, so the columns may still not line up visually.
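One way to account for the wider glyphs is to pad by an estimated display width instead of a character count. The following is only a rough sketch: the names WIDE, codepoint, displaywidth and padcolumns are made up for illustration, the range table is a hand-picked approximation rather than the full Unicode East Asian Width data, and it assumes valid UTF-8 input and a terminal or font that draws these characters two columns wide.

```lua
-- Approximate ranges of code points usually drawn two columns wide
-- (assumption: a small hand-picked subset, not the complete Unicode table).
local WIDE = {
   {0x1100, 0x115F}, {0x2E80, 0xA4CF}, {0xAC00, 0xD7A3},
   {0xF900, 0xFAFF}, {0xFE30, 0xFE6F}, {0xFF00, 0xFF60}, {0xFFE0, 0xFFE6},
}

-- Decode one UTF-8 encoded character (1 to 4 bytes) into its code point.
-- Plain arithmetic is used because Lua 5.1 has no bitwise operators.
local function codepoint(ch)
   local b = ch:byte(1)
   local cp
   if b < 0x80 then cp = b
   elseif b < 0xE0 then cp = b - 0xC0
   elseif b < 0xF0 then cp = b - 0xE0
   else cp = b - 0xF0 end
   for i = 2, #ch do
      cp = cp * 64 + (ch:byte(i) - 0x80)
   end
   return cp
end

-- Estimated number of terminal columns occupied by s.
local function displaywidth(s)
   local width = 0
   -- Each match is one lead byte followed by its continuation bytes.
   for ch in s:gmatch("[%z\1-\127\194-\244][\128-\191]*") do
      local cp, w = codepoint(ch), 1
      for _, r in ipairs(WIDE) do
         if cp >= r[1] and cp <= r[2] then
            w = 2
            break
         end
      end
      width = width + w
   end
   return width
end

-- Right-align s in a field of `columns` display columns.
local function padcolumns(s, columns)
   return (" "):rep(math.max(0, columns - displaywidth(s))) .. s
end

print(padcolumns("character", 30))
print(padcolumns("루아", 30))
print(padcolumns("美國大將", 30))
```

Whether the output really lines up still depends on the font: characters outside the table, such as "∞", are counted as one column, which may or may not match how a given terminal draws them.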

Upvotes: 4

Yu Hao

Reputation: 122453

The ASCII strings are all formatted correctly, while the non-ASCII strings are not.

The reason is that the length of a string is counted in bytes, not characters. For instance, with UTF-8 encoding,

print(string.len("美國大將"))  -- 12
print(string.len("루아"))      -- 6

So %s in string.format treats these two strings as if their widths were 12 and 6, and adds correspondingly fewer spaces of padding.
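To make the byte/character difference concrete, here is a minimal sketch for Lua 5.1 that counts characters by ignoring UTF-8 continuation bytes and pads by that count. The helper names utf8len and padleft are made up for illustration, and the code assumes the strings are valid UTF-8 and the source file is saved as UTF-8.

```lua
-- Count UTF-8 characters: every byte except continuation bytes (0x80-0xBF)
-- starts a new character. Assumes s is valid UTF-8.
local function utf8len(s)
   local _, continuation = s:gsub("[\128-\191]", "")
   return #s - continuation
end

-- Right-align s in a field of `width` characters (not bytes).
local function padleft(s, width)
   return (" "):rep(math.max(0, width - utf8len(s))) .. s
end

print(#"루아", utf8len("루아"))          --> 6   2
print(#"美國大將", utf8len("美國大將"))  --> 12  4
print(padleft("루아", 30))
print(padleft("ab23", 30))
```

This fixes the count but not the visual alignment when a font draws some characters wider than others.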

Upvotes: 2
