Reputation: 1436
I have been trying to understand the printf functionality for octal numbers.
If I write the code as:
#include <stdio.h>

int main(void)
{
    char *s = "\123";
    printf("%s", s);
    return 0;
}
This prints S, which is correct, since the ASCII code of S is 123 in octal.
But how does the compiler identify the sequence of numbers to convert from octal? For example:
char *s = "\123456";
prints S456.
Does it take at most three digits for the octal conversion?
Is there a maximum value the octal number may have? (The largest three-digit octal number is 777.)
Since there are at most 255 ASCII characters (octal 377), when I try to print \777 it prints a � character, which I presume is because no ASCII character is assigned to this number. Also, is this behavior compiler- or OS-dependent?
Upvotes: 6
Views: 15884
Reputation: 1022
Putting a '0' before the number identifies it as octal to the compiler. E.g.: 0123
Putting '0x' before the number identifies it as hexadecimal to the compiler. E.g.: 0x123
Otherwise it's decimal. E.g.: 123
char *s = "\123456"
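For your example escape sequence, \123 is an octal number: inside a string or character literal, a backslash followed by digits is always read as octal, with no leading 0 needed. The compiler uses only three digits because an octal escape sequence is defined to have at most three octal digits, so the trailing 456 are ordinary characters.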
Upvotes: 0
Reputation: 21460
The C99 standard (the one I can look at) defines octal-escape-sequence for strings as this:
octal-escape-sequence:
\ octal-digit
\ octal-digit octal-digit
\ octal-digit octal-digit octal-digit
Thus, any octal-escape-sequence has at most 3 octal digits (0 to 7).
The explanation simply states that:
The octal digits that follow the backslash in an octal escape sequence are taken to be part of the construction of a single character for an integer character constant or of a single wide character for a wide character constant. The numerical value of the octal integer so formed specifies the value of the desired character or wide character.
Also,
Each octal or hexadecimal escape sequence is the longest sequence of characters that can constitute the escape sequence.
And the following constraint:
The value of an octal or hexadecimal escape sequence shall be in the range of representable values for the type unsigned char for an integer character constant, or the unsigned type corresponding to wchar_t for a wide character constant.
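Thus, a value of \777 violates the constraint if at most 8 bits are used (CHAR_BIT < 9). I couldn't find anything in the spec about what happens then. Because it is a constraint violation, a conforming compiler must issue a diagnostic; beyond that I guess the behavior is undefined, and thus compiler dependent.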
Upvotes: 3
Reputation: 225002
Yes. Three digits are the maximum for an octal character literal. From the spec 6.4.4.4 Character constants:
octal-escape-sequence:
\ octal-digit
\ octal-digit octal-digit
\ octal-digit octal-digit octal-digit
hexadecimal-escape-sequence:
\x hexadecimal-digit
hexadecimal-escape-sequence hexadecimal-digit
The maximum octal escape sequence is \777 as you mention. There is no maximum length for a hexadecimal escape sequence, as you can see from the spec quote above.
There are only 128 ASCII characters (0-127). That means you can use octal \000 through \177 for ASCII. If you use a different character set, you might be able to go to \377 in an 8-bit char, and all the way to \777 (or higher, using hex escape sequences) for a wchar_t. The spec says:
The value of an octal or hexadecimal escape sequence shall be in the range of representable values for the type unsigned char for an integer character constant, or the unsigned type corresponding to wchar_t for a wide character constant.
On most machines, unsigned char is an 8-bit type, limiting your octal escape sequence to \377 in that context and the hex sequence to \xff. In the case of a 32-bit wchar_t context, the hex sequence could be as high as \xffffffff.
Upvotes: 9