Reputation: 11
I was asked this question in a technical interview: "What are the stages of the compilation process in C?"
I answered:
Then he continued:
"After which one of these compilation stages are all the variables in the program located and given addresses? That is, if there are two variables A and B, after which stage will A and B have addresses in memory?"
(I think he was asking in terms of the file produced after each stage.)
I finally answered that it is after the linker, since extern values need to be defined, but I have no idea whether what I said was right or wrong.
So hopefully someone can help me understand this question.
Upvotes: 0
Views: 569
Reputation: 71566
There is no single answer to the address question, and depending on the platform your variable may have more than one address.
At compile time, a local variable is either kept in a register or allocated an offset relative to the stack pointer, but the stack pointer's value is (usually) not known until that function runs. For variables in .data and .bss, the compiler leaves a mechanism, which depends on the compiler and target, for reaching the variables.
unsigned int x = 5;
unsigned int y;
unsigned int more_fun ( unsigned int );
unsigned int fun ( unsigned int z )
{
    unsigned int a;
    a = x + 1;
    a = a + more_fun(y) + y + z;
    return(a);
}
00000000 <fun>:
0: e92d4070 push {r4, r5, r6, lr}
4: e1a06000 mov r6, r0
8: e59f3028 ldr r3, [pc, #40] ; 38 <fun+0x38>
c: e5934000 ldr r4, [r3]
10: e59f5024 ldr r5, [pc, #36] ; 3c <fun+0x3c>
14: e5950000 ldr r0, [r5]
18: ebfffffe bl 0 <more_fun>
1c: e5953000 ldr r3, [r5]
20: e0844003 add r4, r4, r3
24: e2844001 add r4, r4, #1
28: e0844006 add r4, r4, r6
2c: e0840000 add r0, r4, r0
30: e8bd4070 pop {r4, r5, r6, lr}
34: e12fff1e bx lr
In this case z is not stored on the stack. Instead, a register (r6) is saved on the stack and z is kept in that register, so it doesn't have an address, relative or otherwise. x and y do have addresses, to be filled in later by the linker; that is how this compiler and target solve the problem. This is obviously optimized. a does not have an address either; it is handled in a register. Had I not optimized, a and z would have stack-pointer-relative storage, and the globals would stay global.
Once linked, though:
00200008 <more_fun>:
200008: e12fff1e bx lr
0020000c <fun>:
20000c: e92d4070 push {r4, r5, r6, lr}
200010: e1a06000 mov r6, r0
200014: e59f3028 ldr r3, [pc, #40] ; 200044 <fun+0x38>
200018: e5934000 ldr r4, [r3]
20001c: e59f5024 ldr r5, [pc, #36] ; 200048 <fun+0x3c>
200020: e5950000 ldr r0, [r5]
200024: ebfffff7 bl 200008 <more_fun>
200028: e5953000 ldr r3, [r5]
20002c: e0844003 add r4, r4, r3
200030: e2844001 add r4, r4, #1
200034: e0844006 add r4, r4, r6
200038: e0840000 add r0, r4, r0
20003c: e8bd4070 pop {r4, r5, r6, lr}
200040: e12fff1e bx lr
200044: 0021004c eoreq r0, r1, r12, asr #32
200048: 00210050 eoreq r0, r1, r0, asr r0
Disassembly of section .data:
0021004c <x>:
21004c: 00000005 andeq r0, r0, r5
Disassembly of section .bss:
00210050 <y>:
210050: 00000000 andeq r0, r0, r0
Now x and y have known, fixed addresses. So when you see an answer or comment here saying link time, that is what they are talking about. In this case the compiler didn't end up needing any stack-based variables; those would technically be runtime. With a trivial program and, say, only one call to the function, their location could be predetermined and would end up fixed, essentially determined at link time. But don't assume that: assume that non-static locals are determined at run time.
Had I built with -fPIC, the access to x and y would be a double indirection: first a read of the global offset table, which in turn holds the address of the variable itself. The initial addresses ARE determined at link time, but can be modified at load time to be somewhere else.
And then there is virtual vs physical. If you are running on an operating system, it doesn't have to but likely uses an MMU to let the program think it is in some zero-based memory space (the program loads at offset, say, 0x8000 as far as the program and toolchain are concerned). But there is also a physical address, which can vary for each load. Worse, if the program is swapped out it could come back somewhere else. So long as the virtual space is set up right, the physical address can differ at load time, or at run time if swapped out.
That is the problem when you see questions like this in an interview or on a college test. Sometimes the person asking is looking for a specific answer like "linker", which is true in a great number of situations, but there are exceptions. Or perhaps the person asking knows more than just enough to be dangerous and is looking for load time, link time, or run time, or for a longer explanation.
There are further exceptions to the answers discussed thus far. So it is likely that the person asking had a specific answer or reason in mind, and you are very unlikely to read their mind and get it right. That makes it an unfair/bad question, and I would hesitate to work for a place that asks such poor questions. Unless it is the latter: they know all the nuances and are trying to see whether you do too. It could be a weed-out question to see who stumbles, and may have nothing to do with their product or development.
I recommend you get or build cross compilers for a few of the GNU-supported targets (say pdp-11, not joking, arm, x86, and maybe another), then try experiments like the above, or disassemble actual projects you are working on, and see how the tools work. If given the freedom in the interview, you can say "let me show you", get on a laptop, and bang out a simple example. If THEY are not following YOU and are getting confused, say thank you and look for a different employer. At the same time, we do all-day interviews with several of us taking turns with the candidate one on one, and it is not uncommon in the post-interview discussion to hear "I asked this question and this was their answer", and for others in the room to say "I don't even know what you are trying to ask there". So sometimes it is just a bad question.
I can't imagine what kind of job would really care about such a thing. Why would this be a relevant interview question? Is this for a toolchain developer?
EDIT
Short answer: there is more than one correct answer, which also means the answers can appear to contradict each other.
Compile time: stack-pointer-relative offsets for local items are determined. But the stack pointer itself, and thus the actual address, is a runtime thing for that function.
Link time: addresses are applied to the remaining items, including global variables. So link time is a correct answer.
Load time: changes can be made at load time, with position-independent code for example, so load time is a correct answer.
And then there is of course virtual vs physical addresses: the physical addresses behind the MMU are set at load time, and can change at run time.
Upvotes: -1
Reputation: 59
I just want to add some clarification to user3386109's comment:
Hope it helps.
Upvotes: 2