joelw

Reputation: 431

Can the null character be used to represent the zero character?

The C99 standard requires that "A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string." (5.2.1.2) It then goes on to list 99 other characters that must be in the execution set. Can a character set be used in which the null character is one of these 99 characters? In particular, is it allowed that '0' == '\0' ?

Edit: Everyone is pointing out that in ASCII, '0' is 0x30. This is true, but the standard doesn't mandate the use of ASCII.

Upvotes: 1

Views: 455

Answers (5)

Steve Jessop

Reputation: 279385

I don't think the standard states that each of the characters that it lists (including the null character) has a distinct value, other than that the digits do. But a "character set" containing a value 0 that allegedly represents 91 of the 100 required characters is clearly not really a character set containing the required 100 characters. So this is either:

  • part of the English-language definition of "a character set",
  • obvious from context,
  • a very minor flaw in the text of the standard, which should spell this out to prevent wilful misinterpretation by a faithless implementer.

Take your pick.
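To illustrate the one distinctness guarantee mentioned above: C99 5.2.1 requires each decimal digit after '0' to have a value one greater than the previous digit, so digit arithmetic is portable even on non-ASCII character sets. A minimal sketch, where `digit_value` is a hypothetical helper name and not something from the standard:

```c
/* Hypothetical helper: returns the numeric value of a decimal digit
   character, or -1 if c is not a decimal digit.
   Relies only on the guarantee that '0'..'9' are consecutive. */
int digit_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return -1;
}
```

No comparable guarantee exists for letters, which is why the same trick with 'a'..'z' is not portable (EBCDIC has gaps there).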

Upvotes: 3

Alex

Reputation: 10136

If '0' == '\0', you would not be able to distinguish the end of a string from the '0' character.

A string like "0_any_string" would then be unusable, because it would be terminated by its very first character.
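A rough way to see the effect on a real (conforming) implementation is to simulate the hypothetical one: replace every '0' with '\0' and ask strlen() what is left. This is only a sketch; `length_if_digit_zero_were_null` is an invented name and the 63-character limit is an arbitrary assumption to keep the buffer fixed-size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the length strlen() would report for s
   on an imaginary implementation where '0' had the same value as '\0'.
   Only the first 63 characters of s are considered. */
size_t length_if_digit_zero_were_null(const char *s)
{
    char buf[64];
    size_t n = strlen(s);

    if (n >= sizeof buf)
        n = sizeof buf - 1;

    /* Every '0' becomes a terminator, as it would if the values coincided. */
    for (size_t i = 0; i < n; i++)
        buf[i] = (s[i] == '0') ? '\0' : s[i];
    buf[n] = '\0';

    return strlen(buf);
}
```

Called with "0_any_string" this returns 0, and with "abc0def" it returns 3: everything from the first '0' onward is lost to every string function.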

Upvotes: 2

SomeWittyUsername

Reputation: 18368

No, it can't. A character set must be described by an injective function, i.e. a function that maps each character to its own distinct binary value. Mapping two characters to the same value would make the character set ambiguous: the computer could not decode that value back to a single character, since more than one would fit.

The C99 standard imposes a further restriction by fixing the null character to a specific binary value. Combined with the paragraph above, this means that no other character can share the null character's value.

Upvotes: 1

glglgl

Reputation: 91139

Whether you use ASCII, EBCDIC or something "self-crafted", '0' must be distinct from '\0', for the reason you mention yourself:

A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string. (5.2.1.2)

If the null character terminates a character string, it cannot be contained in that string. It is the only character which cannot be contained in a string; all other characters can be used and thus must be distinct from 0.

Upvotes: 3

inquam

Reputation: 12942

The integer constant literal 0 has different meanings depending upon the context in which it's used. In all cases it is still an integer constant with the value 0; it is just described in different ways.

If a pointer is being compared to the constant literal 0, then this is a check to see if the pointer is a null pointer. This 0 is then referred to as a null pointer constant. The C standard defines that 0 cast to the type void * is both a null pointer and a null pointer constant.
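As a small sketch of those contexts, the two helpers below use the same value 0 once as a null pointer constant and once as the null character. The function names are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: in a pointer comparison, the literal 0 acts as a
   null pointer constant, so this is the same test as p == NULL. */
int is_null_pointer(const void *p)
{
    return p == 0;
}

/* Hypothetical helper: '\0' is an integer constant with value 0, used
   here as the null character. The digit '0' is a different character. */
int is_null_character(char c)
{
    return c == '\0';
}
```

For example, is_null_pointer(NULL) and is_null_character(0) both return 1, while is_null_character('0') returns 0 on any conforming implementation.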

What is the difference between NULL, '\0' and 0

Upvotes: -1
