Reputation: 643
I've looked through the links What is the difference between exit and return? and return statement vs exit() in main() to find the answer, but in vain.
The problem with the first link is that the answer assumes return from any function. I want to know the exact difference between the two when in the main() function. Even if there's only a small difference, I'd like to know what it is. Which is preferred and why? Is there any performance gain in using return over exit() (or exit() over return) with all compiler optimizations turned off?
The problem with the second link is that I'm not interested in knowing what happens in C++. I want the answer specifically pertaining to C.
EDIT: After recommendation by a person, I actually tried to compare the assembly output of the following programs:
Note: Using gcc -S <myprogram>.c
Program mainf.c:
int main(void){
    return 0;
}
Assembly output:
.file "mainf.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.9.2-10ubuntu13) 4.9.2"
.section .note.GNU-stack,"",@progbits
Program mainf1.c:
#include <stdlib.h>
int main(void){
    exit(0);
}
Assembly output:
.file "mainf1.c"
.text
.globl main
.type main, @function
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %edi
call exit
.cfi_endproc
.LFE2:
.size main, .-main
.ident "GCC: (Ubuntu 4.9.2-10ubuntu13) 4.9.2"
.section .note.GNU-stack,"",@progbits
Noting that I'm not well versed in assembly, I can see some differences between the two programs, with the exit() version being shorter than the return version. What's the difference?
Upvotes: 7
Views: 4533
Reputation: 753605
One major difference between using return and calling exit() in the main() program is that if you call exit(), the local variables in main() still exist and are valid, whereas if you return, they are not.
This matters if you've done anything such as:
#include <stdio.h>
#include <stdlib.h>

static void function_using_stdout(void)
{
    char space[512];
    char *base = space;
    for (int j = 0; j < 10; j++)
    {
        base += sprintf(base, "Hysterical raisins #%d (continued) ", j+1);
        printf("%d..%d: %.24s\n", j*24, j*24+23, space + j * 24);
    }
    printf("Catastrophic elegance\n");
}

int main(int argc, char **argv)
{
    char buffer[64]; // Deliberately rather small
    setvbuf(stdout, buffer, _IOFBF, sizeof(buffer));
    atexit(function_using_stdout);
    for (int i = 0; i < 3; i++)
        function_using_stdout();
    printf("All done - exiting now\n");
    if (argc > 1)
        return 1;
    else
        exit(2);
}
because now the function called (via atexit()) from the startup code that called main() doesn't have a valid buffer for standard output. Whether it crashes or merely gets thoroughly confused or prints garbage or appears to work is open to debate.
I called the program hysteresis. When run with no arguments, it used exit() and worked correctly/sanely (the local space variable in function_using_stdout() was not sharing space with the I/O buffer for stdout):
$ ./hysteresis
'hysteresis' is up to date.
0..23: Hysterical raisins #1 (c
24..47: ontinued) Hysterical rai
48..71: sins #2 (continued) Hyst
72..95: erical raisins #3 (conti
96..119: nued) Hysterical raisins
120..143: #4 (continued) Hysteric
144..167: al raisins #5 (continued
168..191: ) Hysterical raisins #6
192..215: (continued) Hysterical r
216..239: aisins #7 (continued) Hy
Catastrophic elegance
0..23: Hysterical raisins #1 (c
24..47: ontinued) Hysterical rai
48..71: sins #2 (continued) Hyst
72..95: erical raisins #3 (conti
96..119: nued) Hysterical raisins
120..143: #4 (continued) Hysteric
144..167: al raisins #5 (continued
168..191: ) Hysterical raisins #6
192..215: (continued) Hysterical r
216..239: aisins #7 (continued) Hy
Catastrophic elegance
0..23: Hysterical raisins #1 (c
24..47: ontinued) Hysterical rai
48..71: sins #2 (continued) Hyst
72..95: erical raisins #3 (conti
96..119: nued) Hysterical raisins
120..143: #4 (continued) Hysteric
144..167: al raisins #5 (continued
168..191: ) Hysterical raisins #6
192..215: (continued) Hysterical r
216..239: aisins #7 (continued) Hy
Catastrophic elegance
All done - exiting now
0..23: Hysterical raisins #1 (c
24..47: ontinued) Hysterical rai
48..71: sins #2 (continued) Hyst
72..95: erical raisins #3 (conti
96..119: nued) Hysterical raisins
120..143: #4 (continued) Hysteric
144..167: al raisins #5 (continued
168..191: ) Hysterical raisins #6
192..215: (continued) Hysterical r
216..239: aisins #7 (continued) Hy
Catastrophic elegance
$
When called with at least one argument, things went haywire (the local space variable in function_using_stdout() was probably sharing space with the I/O buffer for stdout, unless that was being used by the code that executes the functions registered with atexit()):
$ ./hysteresis aleph
0..23: Hysterical raisins #1 (c
24..47: ontinued) Hysterical rai
48..71: sins #2 (continued) Hyst
72..95: erical raisins #3 (conti
96..119: nued) Hysterical raisins
120..143: #4 (continued) Hysteric
144..167: al raisins #5 (continued
168..191: ) Hysterical raisins #6
192..215: (continued) Hysterical r
216..239: aisins #7 (continued) Hy
Catastrophic elegance
0..23: Hysterical raisins #1 (c
24..47: ontinued) Hysterical rai
48..71: sins #2 (continued) Hyst
72..95: erical raisins #3 (conti
96..119: nued) Hysterical raisins
120..143: #4 (continued) Hysteric
144..167: al raisins #5 (continued
168..191: ) Hysterical raisins #6
192..215: (continued) Hysterical r
216..239: aisins #7 (continued) Hy
Catastrophic elegance
0..23: Hysterical raisins #1 (c
24..47: ontinued) Hysterical rai
48..71: sins #2 (continued) Hyst
72..95: erical raisins #3 (conti
96..119: nued) Hysterical raisins
120..143: #4 (continued) Hysteric
144..167: al raisins #5 (continued
168..191: ) Hysterical raisins #6
192..215: (continued) Hysterical r
216..239: aisins #7 (continued) Hy
Catastrophic elegance
Al) Hysterical raisins #2 (continued) l raisins #1 (c
24..47: ontinued) Hysterical rai
48..71: l rai
48..71: nued) Hyst
72..95: 71: nued) Hyst
72..95: 7
96..119: nued) Hysterical raisins
120..143: #4 (continued) Hysteric
144..167: al raisins #5 (continued
168..191: ) Hysterical raisins #6
192..215: (continued) Hysterical r
216..239: aisins #7 (continued) Hy
Catastrophic elegance
$
Most of the time, this sort of thing isn't a problem. However, when it matters, it really does matter. And, note, it isn't visible as a problem until the program is exiting — which can make it tricky to debug.
Upvotes: 2
Reputation: 3094
Disclaimer: This answer does not quote the C Standards.
Both methods jump into GLibC code, and to know exactly what that code is doing, or which one is faster or more efficient, you'll need to read it. If you want to know more about GLibC, you should check the sources for GCC and GLibC. There are links at the end for those.
First: there's a difference between exit(3) and _exit(2). The first is a GLibC function that does its own cleanup (running atexit() handlers, flushing streams) and then invokes the second, which is a system call. The one we use in our program, and which requires the inclusion of stdlib.h, is exit(3): the GLibC function, not the system call.
Now, programs are not just your simple instructions. They contain heavy loads of GLibC's own instructions. These GLibC functions serve several purposes related to loading and providing the library functionality you use. For that to work GLibC must be "inside" your program.
So, how is GLibC inside your program? Well, it puts itself there through your compiler (it sets some static code and some hooks into the dynamic library) - most likely you're using gcc.
I suppose you know what stack frames are, so I won't explain them. The cool thing to notice is that main() itself has its own stack frame. And that stack frame returns somewhere, and it must return... But to where?
Let's compile the following:
int main(void)
{
    return 0;
}
And compile and debug it with:
$ gcc -o main main.c
$ gdb main
(gdb) disass main
Dump of assembler code for function main:
0x00000000004005e8 <+0>: push %rbp
0x00000000004005e9 <+1>: mov %rsp,%rbp
0x00000000004005ec <+4>: mov $0x0,%eax
0x00000000004005f1 <+9>: pop %rbp
0x00000000004005f2 <+10>: retq
End of assembler dump.
(gdb) break main
(gdb) run
Breakpoint 1, 0x00000000004005ec in main ()
(gdb) stepi
...
Now, stepi will make for the fun part. It steps one instruction at a time, so it's perfect for following function calls. After you run stepi for the first time, just hold your finger on ENTER until you get tired.
What you must observe is the sequence in which functions are called with this method. You see, ret is a "jumping" instruction (edit: after David Hoelzer's comment, I see that calling ret a simple jump is an over-generalization): after we pop rbp, ret itself will pop the return pointer from the stack and jump to it. So, if GLibC built that stack frame, retq is making our return 0; C statement jump right into GLibC's own code! How clever!
The order of function calls I got started roughly like this:
__libc_start_main
exit
__run_exit_handlers
_dl_fini
rtld_lock_default_lock_recursive
_dl_fini
_dl_sort_fini
Compiling this:
#include <stdlib.h>
int main(void)
{
    exit(0);
}
And compiling and debugging...
$ gcc -o exit exit.c
$ gdb exit
(gdb) disass main
Dump of assembler code for function main:
0x0000000000400628 <+0>: push %rbp
0x0000000000400629 <+1>: mov %rsp,%rbp
0x000000000040062c <+4>: mov $0x0,%edi
0x0000000000400631 <+9>: callq 0x4004d0 <exit@plt>
End of assembler dump.
(gdb) break main
(gdb) run
Breakpoint 1, 0x000000000040062c in main ()
(gdb) stepi
...
And the function sequence I got was:
exit@plt
??
_dl_runtime_resolve
_dl_fixup
_dl_lookup_symbol_x
do_lookup_x
check_match
_dl_name_match
strcmp
There's a cool tool for printing the symbols defined within a binary: nm. I suggest you take a look at it, as it will give you an idea of how much "crap" is added to a simple program like the ones above.
To use it in the simplest form:
$ nm main
$ nm exit
That will print a list of symbols in the file. Note that this list does not include references these functions will make. So if a given function in this list calls another function, the other probably won't be in the list.
It depends heavily on the way GLibC chooses to handle a simple stack frame return from main and how it implements the exit wrapper. In the end, the _exit(2) system call will get called and you'll exit your process.
Finally, to really answer your question: both methods jump into GLibC code, and to know exactly what that code is doing you'll need to read it. If you want to know more about GLibC, you should check the sources for GCC and GLibC.
stdlib/exit.c and stdlib/exit.h for the implementations.
kernel/exit.c for the _exit(2) system call implementation, and include/syscalls.h for the preprocessor magic behind it.
gcc (compiler, not suite) sources, and would appreciate if anyone could point out where the runtime sequence is defined.
Upvotes: 5
Reputation: 206567
There is practically no difference between calling exit or executing return from main as long as main returns a type that is compatible with int.
From the C11 Standard:
5.1.2.2.3 Program termination
1 If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument; reaching the } that terminates the main function returns a value of 0. If the return type is not compatible with int, the termination status returned to the host environment is unspecified.
Upvotes: 6
Reputation: 16331
Functionally, from the main() function there is really no difference in C. For example, even if you registered a handler with the atexit() library call, both return and exit() from main will call that function.
The exit() call, however, has the flexibility that you can use it to cause a program to exit with a return code from any point within the code.
There are technical differences, though. If you compile the following to assembly:
int main()
{
    return 1;
}
the final portion of that code will be:
movl $1, %eax
movl $0, -4(%rbp)
popq %rbp
retq
On the other hand, the following code compiled to assembly:
#include <stdlib.h>
int main()
{
    exit(1);
}
will be identical in all respects except that it ends as follows:
subq $16, %rsp
movl $1, %edi
movl $0, -4(%rbp)
callq _exit
Aside from the 1 being put into EDI rather than EAX, as required by the calling convention on the platform where I compiled this code for the call to _exit (the leading underscore is presumably that platform's symbol prefix for exit(), not the _exit() system call), you'll note two differences. First, a stack alignment operation takes place to prepare for the function call. Second, rather than terminating with a retq, we are now calling into the system library, which will handle the final return code and terminate the process.
Upvotes: 4
Reputation: 10998
exit is a library function (declared in stdlib.h, and ultimately ending in a system call) while return is a statement of the language.
exit terminates the current process; return returns from a function call.
In the main() function, they both accomplish the same thing:
int main() {
    // code
    return 0;
}

int main() {
    // code
    exit(0);
}
While in a function:
void f() {
    // code
    return; // return to where it was called from.
}

void f() {
    // code
    exit(0); // terminates program
}
Upvotes: 3