Reputation: 22906
From The C Programming Language by Brian W. Kernighan and Dennis M. Ritchie:
The & operator only applies to objects in memory: variables and array elements. It cannot be applied to expressions, constants, or register variables.
Where are expressions and constants stored if not in memory? What does that quote mean?
For example:
&(2 + 3)
Why can't we take its address? Where is it stored?
Will the answer be the same for C++, since C is its parent?
This linked question explains that such expressions are rvalue objects, and that rvalue objects do not have addresses.
My question is where are these expressions stored such that their addresses can't be retrieved?
Upvotes: 47
Views: 5424
Reputation: 5842
Consider the following function:
unsigned sum_evens (unsigned number) {
    number &= ~1; // ~1 = 0xfffffffe (32-bit CPU)
    unsigned result = 0;
    while (number) {
        result += number;
        number -= 2;
    }
    return result;
}
Now, let's play the compiler game and try to compile this by hand. I'm going to assume you're using x86 because that's what most desktop computers use. (x86 is the instruction set for Intel-compatible CPUs.)
Let's go through a simple (unoptimized) version of what this routine could look like when compiled:
sum_evens:
    and edi, 0xfffffffe ;edi is where the first argument goes
    xor eax, eax        ;set register eax to 0
    cmp edi, 0          ;compare number to 0
    jz .done            ;if edi = 0, jump to .done
.loop:
    add eax, edi        ;eax = eax + edi
    sub edi, 2          ;edi = edi - 2
    jnz .loop           ;if edi != 0, go back to .loop
.done:
    ret                 ;return (value in eax is returned to caller)
Now, as you can see, the constants in the code (0, 2, 1) actually show up as part of the CPU instructions! In fact, 1 doesn't show up at all; the compiler (in this case, just me) already calculates ~1 and uses the result in the code.
While you can take the address of a CPU instruction, it often makes no sense to take the address of a part of one (on x86 you sometimes can, but on many other CPUs you simply cannot do this at all). Code addresses are also fundamentally different from data addresses, which is why you cannot treat a function pointer (a code address) as a regular pointer (a data address). On some CPU architectures, code addresses and data addresses are completely incompatible, although this is not the case for x86 in the way most modern OSes use it.
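To make the code-address versus data-address distinction concrete, here is a minimal C sketch. It reuses sum_evens from above; the helper call_via_pointer is a hypothetical name introduced only for illustration and is not part of the original answer.

```c
#include <assert.h>

/* Same routine as above, written with an unsigned constant. */
unsigned sum_evens(unsigned number) {
    number &= ~1u;                  /* clear the lowest bit: round down to even */
    assert((number & 1u) == 0);     /* sanity check: number is now even */
    unsigned result = 0;
    while (number) {
        result += number;
        number -= 2;
    }
    return result;
}

/* Hypothetical helper: uses one code address and one data address. */
unsigned call_via_pointer(unsigned n) {
    unsigned (*fp)(unsigned) = sum_evens;   /* function pointer: a code address */
    unsigned local = n;
    unsigned *dp = &local;                  /* object pointer: a data address */
    /* void *v = fp;   not portable: ISO C defines no conversion between
                       function pointers and object pointers */
    return fp(*dp);                         /* call through fp, read through dp */
}
```

Calling call_via_pointer(10) goes through both kinds of pointer and yields the same value as sum_evens(10), which is 10 + 8 + 6 + 4 + 2 = 30.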
Do notice that while (number) is equivalent to while (number != 0). That 0 doesn't show up in the compiled code at all! It's implied by the jnz instruction (jump if not zero). This is another reason why you cannot take the address of that 0: it doesn't have one, it's literally nowhere.
I hope this makes it clearer for you.
Upvotes: 63
Reputation: 67723
where are these expressions stored such that their addresses can't be retrieved?
Your question is not well-formed.
It's like asking why people can discuss ownership of nouns but not verbs. Nouns refer to things that may (potentially) be owned, and verbs refer to actions that are performed. You can't own an action or perform a thing.
Expressions are not stored in the first place, they are evaluated. They may be evaluated by the compiler, at compile time, or they may be evaluated by the processor, at run time.
Consider the statement
int a = 0;
This does two things: first, it declares an integer variable a. This is defined to be something whose address you can take. It's up to the compiler to do whatever makes sense on a given platform, to allow you to take the address of a.
Secondly, it sets that variable's value to zero. This does not mean an integer with value zero exists somewhere in your compiled program. It might commonly be implemented as
xor eax,eax
which is to say, XOR (exclusive-or) the eax register with itself. This always results in zero, whatever was there before. However, there is no fixed object of value 0 in the compiled code to match the integer literal 0 you wrote in the source.
As an aside, when I say that a above is something whose address you can take, it's worth pointing out that it may not really have an address unless you take it. For example, the eax register used in that example doesn't have an address. If the compiler can prove the program is still correct, a can live its whole life in that register and never exist in main memory. Conversely, if you use the expression &a somewhere, the compiler will take care to create some addressable space to store a's value in.
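A small sketch of that difference in C follows. The function name read_through_pointer is hypothetical and exists only for this illustration; the commented-out line shows the form the compiler rejects.

```c
#include <assert.h>

/* Hypothetical helper: writes to a local through its address and reads it back. */
int read_through_pointer(void) {
    int a = 0;      /* could live entirely in a register if &a were never used */
    int *p = &a;    /* taking the address makes the compiler give a real storage */
    *p = 2 + 3;     /* the expression is evaluated; only its value is stored in a */
    /* int *q = &(2 + 3);   error: the operand of & must be an lvalue */
    assert(p == &a);
    return a;
}
```

The expression 2 + 3 never gets an address of its own here. What is addressable is the object a that receives the value, so read_through_pointer() returns 5.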
Note for comparison that I can easily choose a different language where I can take the address of an expression.
It'll probably be interpreted, because compilation usually discards these structures once the machine-executable output replaces them. For example, Python has runtime introspection and code objects.
Or I can start from LISP and extend it to provide some kind of addressof operation on S-expressions.
The key thing they both have in common is that they are not C, which as a matter of design and definition does not provide those mechanisms.
Upvotes: 42
Reputation: 1
Where are expressions and constants stored if not in memory
In some (actually many) cases, a constant expression is not stored at all. In particular, think about optimizing compilers, and see Matt Godbolt's CppCon 2017 talk “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”.
In your particular case of some C code containing 2 + 3, most optimizing compilers would constant-fold that into 5, and that constant 5 might sit just inside some machine code instruction (as a bit field) in your code segment and not even have a well-defined memory location. If that constant 5 were a loop limit, some compilers could unroll the loop, and the constant would no longer appear in the binary code.
See also this answer, etc...
Be aware that C11 is a specification written in English. Read its n1570 standard. Read also the much bigger specification of C++11 (or later).
Taking the address of a constant is forbidden by the semantics of C (and of C++).
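If an address is really needed, the usual way out is to put the value into an object first. The sketch below shows two ways to do that in C, under the assumption of a C99 or later compiler; sum_of_addressable_fives is a hypothetical name. Note that the compound literal form is C only and is not valid standard C++.

```c
#include <assert.h>

/* Hypothetical helper: reads the value 5 through two different addresses. */
int sum_of_addressable_fives(void) {
    /* const int *bad = &5;   rejected: the constant 5 is not an lvalue */
    const int five = 2 + 3;     /* a named object initialized with the value */
    const int *p = &five;       /* fine: five is an object with an address */
    int *q = &(int){5};         /* C99 compound literal: an unnamed object, also an lvalue */
    assert(*p == 5 && *q == 5);
    return *p + *q;
}
```

In both cases the address belongs to an object holding a copy of the value, not to the constant itself.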
Upvotes: 4
Reputation: 31306
It does not really make sense to take the address of an expression. The closest thing you can do is take a function pointer. Expressions are not stored in the same sense as variables and objects.
Expressions are stored in the actual machine code. Of course you could find the address where the expression is evaluated, but it just doesn't make sense to do it.
Read a bit about assembly. Expressions are stored in the text segment, while variables are stored in other segments, such as data or stack.
https://en.wikipedia.org/wiki/Data_segment
Another way to explain it is that expressions are cpu instructions, while variables are pure data.
One more thing to consider: The compiler often optimizes away things. Consider this code:
int x=0;
while(x<10)
x+=1;
This code will probably be optimized to:
int x=10;
So what would the address of (x+=1) mean in this case? It is not even present in the machine code, so it has, by definition, no address at all.
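As a rough illustration, the loop can be wrapped in a function (count_to_ten is a hypothetical name). Whether a given compiler really collapses it depends on the compiler and optimization level, but the observable result is the same either way.

```c
#include <assert.h>

/* Hypothetical wrapper around the loop shown above. */
int count_to_ten(void) {
    int x = 0;
    while (x < 10)
        x += 1;         /* an optimizer is free to replace the loop with x = 10 */
    assert(x == 10);    /* the only thing the program can observe */
    return x;
}
```

Since only the final value of x is observable, the compiler may emit code with no trace of the individual x += 1 steps.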
Upvotes: 5
Reputation: 213693
Such expressions end up part of the machine code. An expression 2 + 3 likely gets translated to the machine code instruction "load 5 into register A". CPU registers don't have addresses.
Upvotes: 10