Reputation: 1965
There is a C runtime startup file which, according to https://en.wikipedia.org/wiki/Crt0, lives in crt0.o and is called to initialize variables before calling main. I have copied it here:
.text
.globl _start
str : .asciz "abcd\n"
_start:
xor %ebp, %ebp #basePointer == 0
mov (%rsp), %edi #argc from stack
lea 8(%rsp), %rsi #pointer to argv
lea 16(%rsp,%rdi,8), %rdx #pointer to envp
xor %eax, %eax
call main
mov %eax, %edi
xor %eax, %eax
call _exit
main:
lea str(%rip), %rdi
call puts
I have some questions regarding the implementation:
What is on the stack before _start is called, given that _start should be the only entry point for the linker? I am asking because there are instructions such as mov (%rsp), %edi #argc from stack, where _start reads a value from the stack, but _start should not have any argc (only main does), nor argv and envp. All these arguments belong to the main function, not to the _start entry point. So what is on the stack before _start?
This should be designed to provide initialization of variables from the .data or .bss segments, but I do not see any such initialization here. It could be related to the stack, but I do not know how. Before the variables are initialized (which should happen in crt0.o, here), they hold an initial value and the linker reserves space for them (also from that link). In which section of memory does gcc reserve space for those uninitialized variables?
Finally, how do I compile this assembly without stdlib when it requires some of its functions (puts, _exit) in order to work? I have tried cc -nostdlib foo.s but got:
/usr/bin/ld: /tmp/ccSKxoPY.o: in function `_start':
(.text+0x21): undefined reference to `_exit'
/usr/bin/ld: /tmp/ccSKxoPY.o: in function `main':
(.text+0x2d): undefined reference to `puts'
collect2: error: ld returned 1 exit status
(I cannot use stdlib; otherwise, there would be two definitions of the _start entry point.)
Upvotes: 0
Views: 1489
Reputation: 48572
- What is in stack before called _start which should be the only entry for linker?
This is defined by the system's ABI. I assume you're on Linux, which uses the System V ABI. In this case, starting at the address in the stack pointer, the stack contains argc, the argv pointers (terminated by a null), the envp pointers (terminated by a null), the auxiliary vector (terminated by a null entry), and finally the strings and other data that the previous pointers point to.
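To make that layout concrete, here is a minimal C sketch that walks such a block of machine words. The struct and function names (`initial_stack`, `parse_initial_stack`, `stack_string`) are made up for illustration, and the sketch assumes the x86-64 Linux layout described above with one pointer-sized word per slot. It is not how any particular libc does it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the words found at the initial stack pointer. */
struct initial_stack {
    uintptr_t argc;         /* word 0: number of arguments */
    const uintptr_t *argv;  /* argc string addresses, then a 0 word */
    const uintptr_t *envp;  /* environment string addresses, then a 0 word */
    const uintptr_t *auxv;  /* (type, value) pairs; a type of 0 ends the list */
};

/* Interpret one stack word as the address of a NUL-terminated string. */
static const char *stack_string(uintptr_t word)
{
    return (const char *)word;
}

/* sp is assumed to hold the value the stack pointer had on entry to _start. */
static struct initial_stack parse_initial_stack(const uintptr_t *sp)
{
    struct initial_stack s;

    s.argc = sp[0];
    s.argv = sp + 1;
    /* Skip argc pointers plus the terminating null word. This is the same
       address the question computes with lea 16(%rsp,%rdi,8), %rdx. */
    s.envp = s.argv + s.argc + 1;

    /* The auxiliary vector starts right after envp's terminating null. */
    const uintptr_t *p = s.envp;
    while (*p != 0)
        p++;
    s.auxv = p + 1;

    return s;
}
```

Calling `parse_initial_stack` on a hand-built array of words (rather than the real stack) is enough to check the arithmetic against the `lea` instructions in the question.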
_start should not have any argc (only main does) nor argv and envp. All these arguments are part of main function, not _start entry point.
That's not right. If _start didn't get those, then where else would main get them from?
- This should be designed to provide initilization of variables from .data or .bss segments, but I do not see such initialization of them here.
The kernel takes care of that when it maps the process into memory. The only time you'd need code to initialize them would be like in C++, if you had a variable initialized to something that wasn't a compile-time constant.
In what section of memory type, does gcc hold space for those not-initialized variables?
That's exactly what .bss is for.
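As a small illustration of the two cases, here is a C sketch with made-up variable names. It assumes a hosted build with a typical GCC toolchain, where the variable with a nonzero initializer usually ends up in .data and the one without an initializer usually ends up in .bss. Neither needs any startup code of its own to have the right value.

```c
#include <stddef.h>

/* Nonzero initializer: typically stored in .data, so the value 42 is part
   of the executable image that gets mapped into memory. */
int data_counter = 42;

/* No initializer: typically placed in .bss, which takes no space in the
   file and reads as zero when the program starts. */
int bss_buffer[8];

/* Returns 1 if every element of bss_buffer is zero, 0 otherwise. */
int bss_is_zeroed(void)
{
    for (size_t i = 0; i < sizeof bss_buffer / sizeof bss_buffer[0]; i++) {
        if (bss_buffer[i] != 0)
            return 0;
    }
    return 1;
}
```

Inspecting the object file's symbol table (for example with nm) should show which section each symbol was assigned to on your toolchain.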
- Finally, how to compile this assembly, without stdlib, but requires some of its function (puts, _exit) in order to work?
If you want to use libc functions, then you need to link against libc. If you don't want libc, the right way is to implement those functions yourself in terms of system calls. For _exit it's simple:
_exit:
movl $60, %eax #exit syscall number on x86-64 Linux; status is already in %edi
syscall
For puts it'd be a little bit more complicated, since you have to do strlen yourself (hint: repnz scasb), handle calling the write syscall in a loop, and write a trailing newline, but it should still be perfectly doable.
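The three steps just listed (length, write loop, trailing newline) can be sketched in C before translating them to assembly. This is only a sketch: the names `my_strlen`, `write_all` and `my_puts_fd` are invented here, the file descriptor is passed explicitly so the function is easy to try out, and it calls the libc `write` wrapper for brevity. In a libc-free build you would swap that call for your own raw write system call stub.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Count bytes up to, but not including, the terminating NUL. */
static size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Keep calling write until all len bytes are out. write may accept fewer
   bytes than requested, so advance the buffer by what was written.
   Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything: retry */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical puts replacement: the string, then a trailing newline. */
static int my_puts_fd(int fd, const char *s)
{
    if (write_all(fd, s, my_strlen(s)) < 0)
        return -1;
    return write_all(fd, "\n", 1);
}
```

With fd set to 1 this behaves like a bare-bones puts on standard output. Note that it appends its own newline, just as the real puts does.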
Just for fun, you could try using -nostartfiles instead of -nostdlib and then calling the libc functions, but this will probably blow up horribly. Writing the functions yourself is definitely the better approach.
Upvotes: 2
Reputation: 18493
First of all, even on the same CPU (e.g. an x86-64 CPU), you need different crt0.S files for different operating systems.
And you need yet another crt0.S for programs that are not started by an operating system (such as an operating system itself).
What is in stack before called _start which should be the only entry for linker?
This depends on the operating system. Linux copies argc, the arguments (argv[n]) and the environment (environ[n]) onto the stack.
The file from your question is intended for an operating system that places argc at rsp+0, followed by the arguments and the environment.
However, I remember a (32-bit) OS that put argc at esp+0x80 instead of esp+0, so this is also possible...
As far as I know, Windows does not put anything on the stack (at least not officially). The corresponding crt0.S code must call a function in a DLL file to get the command line arguments.
In the case of device firmware that starts immediately after the CPU (microcontroller) starts, the crt0.S code must even set the stack pointer to a valid value first. The memory (including the stack) is often completely uninitialized in this case.
Needless to say, the stack does not contain any useful values in this case.
This should be designed to provide initilization of variables from .data ...
In the case of software started by an operating system, the operating system will initialize the .data section. This means that the crt0.S code does not have to do that.
In the case of a microcontroller program (device firmware), the crt0.S code has to do this.
Because your file is obviously intended for an operating system, it does not initialize the .data section.
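For the firmware case, the work the startup code has to do is roughly a copy plus a clear. The following C sketch shows the idea with an invented function name and plain pointer parameters. On a real target the addresses and sizes would come from symbols defined in the linker script, whose names vary from project to project, so they are deliberately left out here.

```c
#include <stddef.h>

/* Hypothetical startup helper for a bare-metal target.
   data_load  : where the initial values of .data are stored (e.g. in flash)
   data_start : where .data lives at run time (in RAM)
   bss_start  : where .bss lives at run time (in RAM)
   Plain loops are used because no C library is assumed at this point. */
static void init_data_and_bss(unsigned char *data_start,
                              const unsigned char *data_load,
                              size_t data_size,
                              unsigned char *bss_start,
                              size_t bss_size)
{
    /* Copy the initial values of initialized variables into RAM. */
    for (size_t i = 0; i < data_size; i++)
        data_start[i] = data_load[i];

    /* Zero the variables that have no explicit initializer. */
    for (size_t i = 0; i < bss_size; i++)
        bss_start[i] = 0;
}
```

A hosted program never needs this step, which is why the file in the question contains nothing like it.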
Finally, how to compile this assembly, without stdlib ...
If you want to use the crt0.S file from your question, you'll definitely require the _exit() function.
And if you want to use the function puts() in your code, you'll also need this function.
If you don't use the standard library, you'll have to write these functions yourself:
...
main:
lea str(%rip), %rdi
call puts
ret
_exit:
...
puts:
...
The exact implementation depends on the operating system you use.
puts() will be a bit tricky to implement; write() would be easier.
Note:
Please also don't forget the ret at the end of the main() function (alternatively, you can jmp to puts() instead of calling it).
Upvotes: 3