Reputation: 1685
Today I learned that if you declare a char variable (which is 1 byte), the assembler actually uses 4 bytes in memory so that the boundaries lie on multiples of the word size.
If a char variable uses 4 bytes anyway, what is the point of declaring it as a char? Why not declare it as an int? Don't they use the same amount of memory?
Upvotes: 2
Views: 720
Reputation: 223231
When you are writing in assembly language and declare space for a character, the assembler allocates space for one character and no more. (I write in regard to common assemblers.) If you want to align objects in assembly language, you must include assembler directives for that purpose.
When you write in C, and the compiler translates it to assembly and/or machine code, space for a character may be padded. Typically this is not done because of alignment benefits for character objects but because you have several things declared in your program. For example, consider what happens when you declare:
char a;
char b;
int i;
char c;
double d;
A naïve compiler might do this:
- Allocate one byte for a at the beginning of the relevant memory, which happens to be aligned to a multiple of, say, 16 bytes.
- Allocate the next byte for b.
- Next comes int i, which needs four bytes. On this machine, int objects must be aligned to multiples of four bytes, or a program that attempts to access them will crash. So the compiler skips two bytes and then sets aside four bytes for i.
- Allocate the next byte for c.
- Skip seven bytes and then allocate eight bytes for d. This makes d aligned to a multiple of eight bytes, which is beneficial on this hypothetical machine.
So, even with a naïve compiler, a character object does not require four whole bytes to itself. It can share with neighbor character objects, or other objects that do not require greater alignment. But there will be some wasted space.
A smarter compiler will do this:
- Allocate eight bytes for d.
- Allocate four bytes for i. Note that i is aligned to a multiple of four bytes because it follows d, which is an eight-byte object aligned to a multiple of eight bytes.
- Allocate one byte each for a, b, and c.
This sort of reordering avoids wasting space, and any decent compiler will use it for memory that it is free to arrange (such as automatic objects on stack or static objects in global memory).
When you declare members inside a struct, the compiler is required to use the order in which you declare the members, so it cannot perform this reordering to save space. In that case, declaring a mixture of character objects and other objects can waste space.
Upvotes: 5
Reputation: 71556
Others have for the most part answered this. Assuming a char is a single byte, does declaring a char mean that it is always padded out to an alignment boundary? No. Some compilers pad by default and some don't, and with many you can change the default through some compiler option or directive. Does this mean you shouldn't use a char? It depends. First off, the padding doesn't always happen, so the few wasted bytes aren't always there. You are also programming in a high-level language using a compiler, so if you think three wasted bytes are the only waste in your whole binary, think again. Depending on the architecture, using chars can bring some savings; for example, loading immediates saves you three bytes or more on some architectures. On other architectures, even simple operations require extra instructions to sign-extend or clip the larger register so that it behaves like a byte-sized register. If you are on a 32-bit computer and you are using an 8-bit character only because you are counting from 1 to 100, you might want to use a full-sized int; in the long run you are probably not saving anything by using the char. Now if this is an 8086-based PC running DOS, that is a different story. Or if it is an 8-bit microcontroller, then you want to lean toward 8-bit variables as much as possible.
Upvotes: 0
Reputation: 121759
Q: Does a program allocate four bytes for every "char" you declare?
A: No - absolutely not ;)
Q: Is it possible that, if you allocate a single byte, the program might "pad" with extra bytes?
A: Yes - absolutely yes.
The issue is "alignment". Some computer architectures must access a data value at an address that is a multiple of a particular size: 16 bits, 32 bits, etc. Other architectures merely perform better when data is accessed at such addresses. Hence "padding": unused bytes inserted so that the next value starts on a suitable boundary.
Upvotes: 3
Reputation: 477318
There may indeed not be any point in declaring a single char variable.
There may however be many good reasons to want a char array, where an int array really wouldn't do the trick!
(Try padding a data structure with ints...)
Upvotes: 0