Mario

Reputation: 529

Lua string.format using UTF8 characters

How can I get the 'right' formatting using string.format with strings containing UTF-8 characters?

Example:

local str = "\xE2\x88\x9E"
print(utf8.len(str), string.len(str))
print(str)
print(string.format("###%-5s###", str))
print(string.format("###%-5s###", 'x'))

Output:

1   3
∞
###∞  ###
###x    ###

It looks like string.format pads using the byte length of the infinity sign (3) instead of its character length (1). Is there a UTF-8-aware equivalent of string.format?

Upvotes: 7

Views: 6427

Answers (2)

Egor Skriptunoff

Reputation: 23727

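function utf8.format(fmt, ...)
   local args, strings, pos = table.pack(...), {}, 0
   -- Walk every conversion specifier; '%%' is a literal percent and consumes no argument.
   for spec in fmt:gmatch'%%.-([%a%%])' do
      if spec ~= '%' then
         pos = pos + 1
         local s = args[pos]
         if spec == 's' and type(s) == 'string' and s ~= '' then
            -- Swap in a placeholder with one byte per UTF-8 character,
            -- so string.format computes the padding per character.
            table.insert(strings, s)
            args[pos] = '\1'..('\2'):rep(utf8.len(s)-1)
         end
      end
   end
   -- Format with the placeholders, then put the original strings back in order.
   return (
      fmt:format(table.unpack(args, 1, args.n))
         :gsub('\1\2*', function() return table.remove(strings, 1) end)
   )
end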

local str = "\xE2\x88\x9E"
print(string.format("###%-5s###", str))  --> ###∞  ###
print(string.format("###%-5s###", 'x'))  --> ###x    ###
print(utf8.format  ("###%-5s###", str))  --> ###∞    ###
print(utf8.format  ("###%-5s###", 'x'))  --> ###x    ###

Upvotes: 4

Youka

Reputation: 2705

Lua added the utf8 library in version 5.3, and it provides only minimal functionality. It is still fairly new and not a focus of the language. Your issue is about how characters are interpreted and rendered, and display concerns fall outside the scope of the standard library and the usual uses of Lua.

For now, you should just adjust the format pattern (for example, the field width) to suit the input.
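
One way to work around the byte-based width, sketched here under a few assumptions, is to skip the width specifier and pad by hand using utf8.len. The helper name pad_right is hypothetical and not part of the Lua standard library. The sketch assumes Lua 5.3 or later, and it counts code points rather than display columns, so wide or combining characters may still misalign.

```lua
-- Hypothetical helper: left-justify s in a field of `width` characters.
-- Counts UTF-8 code points via utf8.len; if s is not valid UTF-8,
-- utf8.len returns nil and we fall back to the byte length.
local function pad_right(s, width)
   local chars = utf8.len(s) or #s
   return s .. string.rep(' ', math.max(0, width - chars))
end

local str = "\xE2\x88\x9E"
print("###" .. pad_right(str, 5) .. "###")  --> ###∞    ###
print("###" .. pad_right('x', 5) .. "###")  --> ###x    ###
```

Padding with string.rep rather than building a wider "%-Ns" pattern avoids the limit string.format places on the number of digits in a width.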

Upvotes: 1
