Reputation: 589
In C, it's true that:
[8-bit] signed char: -128 to 127 on two's-complement machines (the C standard guarantees at least -127 to 127)
[8-bit] unsigned char: 0 to 255
But what really happens in memory? Is a signed char stored in two's complement, and an unsigned char stored as a plain binary value with no special encoding (so that 255 is simply the bit sequence 11111111)?
How does the executable keep track of the variable type it's reading, to figure out whether the value in the CPU register is to be interpreted as two's complement or not? Is there some metadata that associates a variable name with its type?
Thanks!
Upvotes: 0
Views: 136
Reputation: 49279
C is a statically typed language. How a piece of memory is interpreted is defined entirely by the context, that is, by the type of the expression used to access it. The type is known at compile time (sufficiently well in the case of dynamic dispatch), so the compiler makes all the decisions in advance. For the sake of performance, runtime checks are reduced to the bare minimum: in C there are none unless you implement dynamic dispatch or RTTI manually.
In C (and C++) you can easily interpret the same memory location in different ways: all you have to do is take a pointer to it and cast that pointer to a different type. This is very unsafe if you don't know what you are doing.
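As a minimal sketch of that idea, the two helpers below read the same byte through differently typed pointers. The names as_signed and as_unsigned are made up for this illustration. Character types are allowed to alias any object, so this particular cast is well defined; the signed result assumes a two's-complement machine.

```c
#include <stdio.h>

/* Hypothetical helpers, for illustration only: both read the very same
   byte, and only the pointer type used for the access differs. */
int as_signed(const void *p)
{
    return *(const signed char *)p;    /* byte read as a signed value */
}

int as_unsigned(const void *p)
{
    return *(const unsigned char *)p;  /* byte read as an unsigned value */
}

/* Print both interpretations of one byte. */
void show_byte(const void *p)
{
    printf("signed: %d, unsigned: %d\n", as_signed(p), as_unsigned(p));
}
```

For a byte holding the bit pattern 11111111, as_signed gives -1 and as_unsigned gives 255 on a two's-complement machine. Nothing in memory records which reading is the "right" one.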
Upvotes: 2
Reputation: 4849
The internal representation of numbers is not part of the C language; it's a feature of the architecture of the machine itself. Most implementations use two's complement because it makes addition and subtraction the same binary operation for signed and unsigned values.
FYI: almost all existing CPU hardware uses two's complement, so it makes sense that most programming languages do, too.
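A small sketch of why the addition is the same operation: the function below adds two raw 8-bit patterns with wraparound, and a second function decodes a pattern as two's complement by hand. Both names (add_bits, as_twos_complement) are invented for this example.

```c
#include <stdint.h>

/* Add two 8-bit patterns, keeping only the low 8 bits (wraps modulo 256).
   This is the single addition the hardware performs. */
uint8_t add_bits(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Decode an 8-bit pattern as a two's-complement number by hand:
   patterns with the top bit set stand for (pattern - 256). */
int as_twos_complement(uint8_t bits)
{
    return bits >= 128 ? (int)bits - 256 : (int)bits;
}
```

Take the patterns 0xFD and 0x05. Read as unsigned they are 253 and 5, and 253 + 5 wraps to 2. Read as two's complement they are -3 and 5, and -3 + 5 is also 2. In both readings add_bits returns the same pattern, 0x02.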
Upvotes: 0
Reputation: 1863
There is no metadata. The distinction is made at compile time: the compiler emits different instructions for some operations depending on the type, and the underlying hardware simply executes them. It becomes more obvious when you compare the assembly.
void test1()
{
    char p = 0;
    p += 3;
}
void test2()
{
    unsigned char p = 0;
    p += 3;
}
What you see here are the instructions the compiler generated from the source posted above. This is the assembly produced by clang 3.7 with optimization disabled (-O0). You can ignore most of the instructions if you are not familiar with them; keep the focus on movsx and movzx. These two instructions make the difference in how the memory location is treated.
test1():                              # Instructions for test1
    push rbp
    mov rbp, rsp
    mov byte ptr [rbp - 1], 0
    movsx eax, byte ptr [rbp - 1]     <-- Load byte into 32-bit eax with sign-extension
    add eax, 3
    mov cl, al
    mov byte ptr [rbp - 1], cl
    pop rbp
    ret
test2():                              # Instructions for test2
    push rbp
    mov rbp, rsp
    mov byte ptr [rbp - 1], 0
    movzx eax, byte ptr [rbp - 1]     <-- Load byte into 32-bit eax with zero-extension
    add eax, 3
    mov cl, al
    mov byte ptr [rbp - 1], cl
    pop rbp
    ret
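The same difference can be seen from the C side. In this sketch (the function names are hypothetical), each function does nothing but widen its argument to int. On x86-64 a compiler would typically implement the first with a sign-extending move and the second with a zero-extending move, though the exact instructions depend on the compiler and flags.

```c
/* Widening a signed char to int preserves the value, which means the
   upper bits are filled with copies of the sign bit (sign-extension). */
int widen_signed(signed char c)
{
    return c;
}

/* Widening an unsigned char to int also preserves the value, which
   means the upper bits are filled with zeros (zero-extension). */
int widen_unsigned(unsigned char c)
{
    return c;
}
```

Passing -1 to widen_signed returns -1, and passing 255 to widen_unsigned returns 255, even though on a two's-complement machine both arguments occupy one byte with the same bit pattern.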
Upvotes: 5