Reputation: 89
I'm learning the C language with a book called 'C Primer Plus'. Currently, I'm reading about C data types (chapter 3) and have a question.
In the book, it is written that
Character Constants and Initialization
[...] A single character contained between single quotes is a C character constant. [...] Because characters are really stored as numeric values, you can also use the numerical code to assign values. [...] Somewhat oddly, C treats character constants as type int rather than type char. For example, on an ASCII system with a 32-bit int and an 8-bit char, the code
char grade = 'B';
represents 'B' as the numerical value 66 stored in a 32-bit unit, but grade winds up with 66 stored in an 8-bit unit.
My understanding of this part is that the declaration char grade = 'B'; stores the character in two places: the character constant as the numerical value 66 in a 32-bit unit, and the variable grade with 66 in an 8-bit unit. (This was a bit confusing.) Then the very next two sentences in the book are:
This characteristic of character constants makes it possible to define a character constant such as 'FATE', with four separate 8-bit ASCII codes stored in a 32-bit unit. However, attempting to assign such a character constant to a char variable results in only the last 8 bits being used, so the variable gets the value 'E'.
What I understood here is that a multi-character constant is possible, contrary to the definition the book gave at the beginning. However, it is possible in only one of the two places mentioned in the previous part. Because character constants are type int in C, and int is 32 bits here, 'FATE' fits: it has 4 characters of 8 bits each, and 4 * 8 = 32. But a char variable has only 8 bits of space, so only the last letter will be stored.
And, to apply the knowledge, I tried
/* test.c */
#include <stdio.h>
int main(void)
{
char grade = 'FATE';
printf("%d, %c", grade, grade);
return 0;
}
This gave me some warnings when compiling. The warnings were "multi-character character constant" and "overflow in implicit constant conversion".
Then I tried 'FAT', 'FA', and 'F'. Only 'F' worked.
Finally, my questions are:
int unit of my laptop system is only an 8-bit byte?
Upvotes: 0
Views: 1510
Reputation: 1
First, you need to understand literal constants. Often, we use a suffix to tell the compiler which type a literal constant has, like 10LLU or 999L (a plain 123 has no suffix and is an int).
A character constant is one kind of literal constant.
C treats all character constants as type int (but the constant itself is not actually stored in memory yet). The type of the declared variable determines the size of the storage space.
char c = 'abc';
The compiler considers 'abc' as an int, but a char is only one byte in size, so what ends up stored is 'c' (or 'a', depending on your compiler).
printf("%c", c);
Finally, C gives every integer and character constant an integer type (int, unsigned int, long int, ...), but the details of storage still depend on the declared type of the variable.
Upvotes: 0
Reputation: 133919
'B' is a value of type int, but it is not stored anywhere, just like 42 is not stored anywhere; it is a number. However, if you write 42 on paper in decimal, it will take 2 digits of space on the paper. If you've got 42 apples, the number 42 need not be written down and therefore doesn't consume space, but you still have 42 apples...
Now, char grade is an object. An integer value of type char, i.e. an integer of small magnitude (usually -128 ... 127 or 0 ... 255), can be stored into grade.
An object is like a field in a paper form where you can write a certain length of text. A char on a form would have enough space only for a number in that range.
Since ASCII 'B' has the value 66, it conveniently fits into the range of char.
An object of type int could store an integer of possibly larger magnitude (perhaps -32768 ... 32767, or -2147483648 ... 2147483647; that is implementation-defined).
Then we come to the multicharacter constants. Contrary to what the book claims, the C standard states (C11 6.4.4.4p10) that
[...] The value of an integer character constant containing more than one character (e.g., 'ab') [...] is implementation-defined. [...]
i.e. an implementation (the C compiler) can define the value of 'FATE' as it sees fit. The author incorrectly claims that the value needs to be that of 'F' * 16777216 + 'A' * 65536 + 'T' * 256 + 'E', but the C standard requires no such thing. This is the reason why many compilers will warn about the use of multicharacter constants (in your case you got "warning: multi-character character constant [-Wmultichar]"): they're not portable, i.e. the same code will not necessarily behave in the same manner when compiled with different compilers.
Now, let's assume that 'FATE' indeed results in that number, i.e. 1178686533. How would you write the number 1178686533 in a form field where you can only write a number between -128 ... 127? Again, the behaviour is implementation-defined. The author's implementation will take that number modulo 256, which results in 69, the ASCII code of the letter E, but the exact behaviour is again up to the compiler implementation.
To warn about that implementation-defined behaviour, the compiler spat out something like warning: overflow in conversion from 'int' to 'char' changes value from '1178686533' to '69'.
The main takeaway from this is: if you intend to store the last letter of the word FATE in char grade, you do it like this:
char grade = 'E';
The way the author does it could plausibly result in some compiler grading the author's book with grade F...
Upvotes: 2