user3078261

Reputation: 143

Type of character constant

I am studying physics, and the lecture notes of our programming course state that a character constant in C has type char, where by character constant I mean an expression like 'x'. When I asked my lecturer whether that was a mistake, he said no. Even after I showed him the C90, C99 and C11 standards, which clearly state that a character constant has type int, he still would not call it a mistake.

So before asking him again, I wanted to make sure that I have understood this correctly, and to learn why it is that way, because it seems like a waste of memory. All I have found so far is that it is for historical reasons, which is rather vague. I would also like to know why C++ changed the type of a character constant to char.

EDIT: thanks a lot for the answers.

Upvotes: 4

Views: 2484

Answers (3)

John Bode

Reputation: 123458

Online C language standard, 2011 draft:

6.4.4.4 Character constants
...
2 An integer character constant is a sequence of one or more multibyte characters enclosed in single-quotes, as in 'x'. A wide character constant is the same, except prefixed by the letter L, u, or U. With a few exceptions detailed later, the elements of the sequence are any members of the source character set; they are mapped in an implementation-defined manner to members of the execution character set.
...
10 An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

So yes, in C, single character constants such as 'x' have type int.

Why this is the case is, AFAIK, largely lost to history [1], although I suspect it was to minimize conversions when comparing against the results of getchar (which returns an int) or working with variadic functions like printf (where any argument of type char is automatically promoted to int).

Since C++ provided alternate mechanisms for I/O and type-generic operations, it made sense for character constants to have type char.

Does this matter in practice? Not in my experience, but my experience in text processing isn't that extensive.

I should actually read what I copy from the standard; the reason is given in that snippet. You can have multi-character constants like 'abc', which would not logically map to a single char value.


1. Now watch somebody find an extensive quote from Ritchie or Kernighan explaining exactly why they did it.

Upvotes: 6

Eric Postpischil

Reputation: 222526

Speaking from experience, when C was young, it was simpler for expressions to work mostly in int. The early compilers were much less complicated than today’s compilers. Additionally, programming languages generally were not as strongly typed as they are today.

It is not a waste of memory for character constants to have type int, because the C standard does not specify how they have to be kept in memory. Only objects, such as one declared with int d;, must be assigned memory (and even then often only in the abstract computation model of C, not necessarily in the real computer). A C implementation may introduce the value of a character constant to the instructions that execute the program in any way it pleases, as long as the result is correct. E.g., it can store the value in a few bits of an immediate field in an instruction, without using all the space an int requires.

I expect C++ changed the type of character constants to char because it is moving toward stronger typing, with the aim of reducing human errors.

Upvotes: 1

Darren Stone

Reputation: 2068

You may find this code elucidating:

#include <stdio.h>

int main(void) {
    /* sizeof yields a size_t, which is printed with %zu (C99 and later) */
    printf("size of char is %zu\n", sizeof(char));
    printf("size of const char is %zu\n", sizeof(const char));
    printf("size of int is %zu\n", sizeof(int));
    printf("size of 'x' is %zu\n", sizeof('x'));
    return 0;
}

Please compile on your system. On my system (OS X, compiling with either gcc -m32 or gcc -m64), the output is:

size of char is 1
size of const char is 1
size of int is 4
size of 'x' is 4

I hope this helps.

Upvotes: 1
