theta

Reputation: 25601

Print number of characters in UTF-8 string

For example:

local a = "Lua"
local u = "Луа"
print(a:len(), u:len())

Output:

3   6

How can I output the number of characters in a UTF-8 string?

Upvotes: 4

Views: 3564

Answers (4)

Yu Hao

Reputation: 122383

In Lua 5.3, you can use utf8.len to get the number of characters (code points) in a UTF-8 string:

local a = "Lua"
local u = "Луа"
print(utf8.len(a), utf8.len(u))

Output: 3 3
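One detail worth knowing: when the input is not valid UTF-8, utf8.len returns a false value plus the byte position of the first invalid byte instead of a count. A minimal sketch of handling that, assuming Lua 5.3 or later (char_count is a hypothetical helper name, not part of the standard library):

```lua
-- Requires Lua 5.3+; the utf8 library does not exist in 5.1/5.2.
-- char_count is a hypothetical wrapper, named here only for illustration.
local function char_count(s)
  -- On success utf8.len returns the number of code points.
  -- On invalid input it returns a false value and the offending byte position.
  local n, bad_pos = utf8.len(s)
  if not n then
    return nil, "invalid UTF-8 at byte " .. bad_pos
  end
  return n
end

print(char_count("Луа"))      --> 3
print(char_count("Lu\255a"))  -- byte 0xFF can never appear in valid UTF-8
```

The second call yields nil and an error message rather than a length, so callers can tell malformed input apart from an empty string.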

Upvotes: 3

Michal Kottman

Reputation: 16753

If you need to work with Unicode/UTF-8 in Lua, you need an external library, because Lua treats strings as plain sequences of 8-bit bytes. One such library is slnunicode. Example code showing how to calculate the length of your string:

local unicode = require "unicode"
local utf8 = unicode.utf8

local a = "Lua"
local u = "Луа"
print(utf8.len(a), utf8.len(u)) --> 3    3

Upvotes: 6

daven11

Reputation: 3025

Another alternative is to wrap the native OS UTF-8 string functions and let the OS do the heavy lifting. This depends on which OS you use: I've done this on OS X and it works a treat. Windows would be similar. Of course, it opens another can of worms if you just want to run a script from the command line, so it depends on your app.

Upvotes: 0

Nicol Bolas

Reputation: 473272

You don't.

Lua is not Unicode-aware. All it sees is a string of bytes. When you ask for the length, it gives you the length of that byte string. If you want Lua to interact with Unicode strings in some way, you have to either write a Lua module that implements those interactions or download such a module.
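For the specific case of counting characters, writing it yourself can be quite short. A minimal sketch in plain Lua, assuming the input is already valid UTF-8 (it does no validation) and that "character" means code point rather than grapheme cluster; utf8_char_count is a hypothetical name chosen for illustration:

```lua
-- Hypothetical helper: counts UTF-8 code points in a byte string.
-- Assumes `s` is valid UTF-8; malformed input gives a meaningless count.
local function utf8_char_count(s)
  -- Continuation bytes are in the range 0x80..0xBF (decimal 128..191).
  -- Every other byte starts a new code point, so subtract the continuations.
  local _, continuation_bytes = s:gsub("[\128-\191]", "")
  return #s - continuation_bytes
end

local a = "Lua"
local u = "Луа"
print(utf8_char_count(a), utf8_char_count(u)) --> 3    3
```

This relies only on string.gsub and the length operator, so it should not need any external module.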

Upvotes: 2
