Reputation: 301
I read something about strings here: http://www.lua.org/pil/2.4.html
Lua is eight-bit clean and so strings may contain characters with any numeric value, including embedded zeros.
What does "eight-bit clean" mean?
Why can Lua strings contain characters with any numeric value? (How is this different from basic C strings?)
Upvotes: 2
Views: 487
Reputation: 20802
The Lua string type is a counted sequence of bytes. A byte can hold any value between 0 and 255.
The string type is used for character strings. You are right, few character set encodings allow any byte value or sequence of byte values. Code page 437 is one that does; it maps 256 characters to 256 values, one byte per character. Windows-1252 does not; it maps 251 characters to 251 values, one byte per character. UTF-8 maps 1,112,064 characters to sequences of one to four bytes, where some byte values are never used and some sequences of values are never used.
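To make the UTF-8 point concrete, here is a minimal sketch in C. The function names utf8_encoded_length and utf8_byte_is_used are hypothetical helpers made up for illustration, not part of Lua or the C library. The first returns how many bytes UTF-8 needs for a code point, and the second reports whether a byte value can ever appear in well-formed UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes UTF-8 uses for a code point,
   or 0 if the value cannot be encoded (surrogates, or above U+10FFFF). */
size_t utf8_encoded_length(unsigned long cp) {
    if (cp <= 0x7F) return 1;
    if (cp <= 0x7FF) return 2;
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* surrogate range is excluded */
    if (cp <= 0xFFFF) return 3;
    if (cp <= 0x10FFFF) return 4;
    return 0;
}

/* Hypothetical helper: 0xC0, 0xC1 and 0xF5..0xFF never occur in
   well-formed UTF-8, so not every byte value is used. */
int utf8_byte_is_used(unsigned char b) {
    return !(b == 0xC0 || b == 0xC1 || b >= 0xF5);
}
```

A Lua string can hold any of these bytes either way, since it is only a counted sequence of bytes. Whether the bytes form valid UTF-8 is a separate question that Lua's string type does not answer.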
The Lua string library does have functions that treat bytes as characters. Their behavior is influenced by the implementation's libraries, which typically use the C runtime along with its locale features.
There are specialized libraries for Lua to explicitly handle various character set encodings.
Upvotes: 0
Reputation: 726629
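There are two common ways to store strings:
1. Store the characters followed by a terminator character that marks the end of the string.
2. Store the length of the string along with its characters.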
When you use #1, you need to "sacrifice" one character to serve as the terminator; when you use #2, you do not have such limitation.
C uses the first method of storing strings. It uses character zero to serve as the terminator; the other 255 characters can be used to represent characters of the string.
Lua uses the second method of storing strings. All 256 possible character values, including zeros, can be used in Lua strings. For example, you can construct a three-character string from characters 'A', 0, 'B', and Lua will treat it as a three character string. You can construct the same string in C, but its string-processing libraries will treat it as a single-character string: strlen would return 1, puts will write character A and stop, and so on.
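As a rough illustration of the difference, here is a sketch in C. The struct counted_string and the helper functions are hypothetical names invented for this example. They are not how Lua actually lays out its strings internally, only a model of the "length stored alongside the bytes" idea.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical counted-string type: the length is stored beside the bytes,
   so no byte value has to be reserved as a terminator. */
struct counted_string {
    size_t len;
    const char *data;
};

/* The bytes 'A', 0, 'B', plus a final 0 so that strlen stays in bounds. */
static const char a_zero_b[] = { 'A', 0, 'B', 0 };

/* Builds the three-byte example string with its length recorded explicitly. */
struct counted_string make_example(void) {
    struct counted_string s = { 3, a_zero_b };
    return s;
}

/* Length as the C string functions see it: stops at the first zero byte. */
size_t terminated_length(const char *bytes) {
    return strlen(bytes);
}

/* Length as a counted string sees it: whatever was recorded. */
size_t counted_length(struct counted_string s) {
    return s.len;
}

/* Prints both views of the same bytes. */
void show_both(void) {
    struct counted_string s = make_example();
    printf("strlen sees %zu byte(s)\n", terminated_length(s.data));
    printf("counted length is %zu byte(s)\n", counted_length(s));
    fwrite(s.data, 1, s.len, stdout); /* writes all three bytes, zero included */
    putchar('\n');
}
```

Calling show_both() from a main function shows the two lengths side by side. Note that fwrite takes an explicit count, which is why it can write the embedded zero where puts would stop after A.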
Upvotes: 9