What does the C standard specify for the value of a character constant with a hexadecimal escape sequence?

Question

What does the C 2018 standard specify for the value of a character constant containing a hexadecimal escape sequence, such as '\xFF'?

Consider a C implementation in which char is signed and eight bits.

Clause 6.4.4.4 tells us about character constants. In paragraph 6, it discusses hexadecimal escape sequences:

The hexadecimal digits that follow the backslash and the letter x in a hexadecimal escape sequence are taken to be part of the construction of a single character for an integer character constant or of a single wide character for a wide character constant. The numerical value of the hexadecimal integer so formed specifies the value of the desired character or wide character.

The hexadecimal integer is “FF”. By the usual rules of hexadecimal notation, its value¹ is 255. Note that, so far, we do not have a specific type: A “character” is a “member of a set of elements used for the organization, control, or representation of data” (3.7) or a “bit representation that fits in a byte” (3.7.1). When \xFF is used in '\xFF', it is a c-char in the grammar (6.4.4.4 1), and '\xFF' is an integer character constant. Per 6.4.4.4 2, “An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'.”

6.4.4.4 9 specifies constraints on character constants:

The value of an octal or hexadecimal escape sequence shall be in the range of representable values for the corresponding type:

That is followed by a table that, for character constants with no prefix, shows the corresponding type is unsigned char.

So far, so good. Our hexadecimal escape sequence has value 255, which is in the range of an unsigned char.

Then 6.4.4.4 10 purports to tell us the value of the character constant. I quote it here with its sentences separated and labeled for reference:

(i) An integer character constant has type int.

(ii) The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer.

(iii) The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.

(iv) If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

If 255 maps to an execution character, (ii) applies, and the value of '\xFF' is the value of that character. This is the first use of “maps” in the standard; it is not defined elsewhere. Should it mean anything other than a map from the value derived so far (255) to an execution character with the same value? If so, for (ii) to apply, there must be an execution character with the value 255. Then the value of '\xFF' would be 255.

Otherwise (iii) applies, and the value of '\xFF' is implementation-defined.

Regardless of whether (ii) or (iii) applies, (iv) also applies. It says the value of '\xFF' is the value of a char object whose value is 255, subsequently converted to int. But, since char is signed and eight-bit, there is no char object whose value is 255. So the fourth sentence states an impossibility.

Footnote

¹ 3.19 defines “value” as “precise meaning of the contents of an object when interpreted as having a specific type,” but I do not believe that technical term is being used here. “The numerical value of the hexadecimal integer” has no object to discuss yet. This appears to be a use of the word “value” in an ordinary sense.
