ZIFF

Reputation: 301

Why can Lua strings contain characters with any numeric value?

I read this about strings here: http://www.lua.org/pil/2.4.html

Lua is eight-bit clean and so strings may contain characters with any numeric value, including embedded zeros.

What does "eight-bit clean" mean?

Why can a Lua string contain characters with any numeric value, unlike a basic C string?

Upvotes: 2

Views: 487

Answers (2)

Tom Blodget

Reputation: 20802

The Lua string type is a counted sequence of bytes. A byte can hold any value between 0 and 255.
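As a minimal sketch of what "counted sequence of bytes" means in practice, the snippet below builds one string holding every byte value once and checks that nothing is lost. The variable names are only illustrative, and it assumes a standard Lua interpreter.

```lua
-- Build a string that holds every byte value from 0 to 255 exactly once.
local pieces = {}
for value = 0, 255 do
  pieces[#pieces + 1] = string.char(value)
end
local all = table.concat(pieces)

print(#all)           -- 256: the length is stored with the string, not found by scanning
print(all:byte(1))    -- 0: the zero byte is ordinary data
print(all:byte(256))  -- 255
```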

The string type is used for character strings. You are right that few character set encodings allow any byte value or sequence of byte values. Code page 437 is one that does: it maps 256 characters to 256 values, one byte per character. Windows-1252 does not: it maps 251 characters to 251 values, one byte per character, leaving five byte values undefined. UTF-8 maps 1,112,064 characters to sequences of one to four bytes, where some byte values are never used and some sequences of values are invalid.
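To illustrate the gap between bytes and encoded characters, here is a small sketch that assumes Lua 5.3 or later, since it relies on the standard utf8 library. The length operator counts bytes, while utf8.len counts code points and fails on byte sequences that are not valid UTF-8.

```lua
-- Assumption: Lua 5.3+ (the utf8 library does not exist in earlier versions).
local e_acute = "\195\169"  -- U+00E9 encoded as UTF-8: bytes 0xC3 0xA9

print(#e_acute)             -- 2: the string holds two bytes
print(utf8.len(e_acute))    -- 1: those two bytes encode one code point

local stray = "\255"        -- byte 0xFF never occurs in valid UTF-8
print(#stray)               -- 1: still a perfectly valid Lua string
print(utf8.len(stray))      -- fails: nil plus the position of the offending byte
```

Lua itself stores both strings without complaint; only the encoding-aware function objects to the second one.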

The Lua string library does have functions that treat bytes as characters. Their behavior is influenced by the implementation's libraries, which typically use the C runtime along with its locale features.
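A rough sketch of that locale dependence, assuming the standard Lua implementation, where string.upper and the %a pattern class are built on the C character functions. It pins the locale to "C", which every C runtime provides; what happens under other locales depends on what the system has installed, so no output is claimed for them.

```lua
-- Pin the locale to "C" so the results below are predictable.
os.setlocale("C")

local latin1_e_acute = "\233"  -- byte 0xE9, which is "é" in Latin-1 and Windows-1252

print(("abc"):upper())                            -- ABC
print(latin1_e_acute:upper() == latin1_e_acute)   -- true: "C" only maps a-z to A-Z
print(latin1_e_acute:match("%a"))                 -- nil: 0xE9 is not a letter in "C"
```

Under a single-byte locale that treats 0xE9 as a letter, the last two results may differ, which is why the specialized encoding libraries are the safer route for non-ASCII text.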

There are specialized libraries for Lua to explicitly handle various character set encodings.

Upvotes: 0

Sergey Kalinichenko

Reputation: 726629

There are two common ways to store strings:

  1. Characters and Terminator
  2. Length and Characters

When you use #1, you need to "sacrifice" one character value to serve as the terminator; when you use #2, there is no such limitation.

C uses the first method of storing strings. It uses character zero to serve as the terminator; the other 255 characters can be used to represent characters of the string.

Lua uses the second method of storing strings. All 256 possible character values, including zero, can be used in Lua strings. For example, you can construct a three-character string from the characters 'A', 0, 'B', and Lua will treat it as a three-character string. You can construct the same sequence in C, but its string-processing library will treat it as a single-character string: strlen will return 1, puts will write the character A and stop, and so on.
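The Lua side of that 'A', 0, 'B' example can be sketched as follows. The variable name is illustrative, and the snippet assumes a standard Lua interpreter.

```lua
local s = "A\0B"       -- 'A', a zero byte, 'B'

print(#s)              -- 3: the zero counts like any other character
print(s:byte(1, -1))   -- 65  0  66
print(s == "A\0C")     -- false: comparison looks past the embedded zero
print(#(s .. s))       -- 6: concatenation keeps both zeros
```

Printing s itself is less informative, because how a zero byte is displayed depends on the terminal.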

Upvotes: 9
