Reputation: 402
This is a C program that has been compiled to assembly using gcc -S. How is the string "Hello, world" represented in this program?
This is the C-code:
#include <stdio.h>

int main(void) {
    char st[] = "Hello, world";
    printf("%s\n", st);

    return 0;
}
Here's the assembly code:
    .intel_syntax noprefix
    .text
    .globl main

main:
    push rbp
    mov rbp, rsp
    sub rsp, 32
    mov rax, QWORD PTR fs:40
    mov QWORD PTR [rbp-8], rax
    xor eax, eax
    movabs rax, 8583909746840200520
    mov QWORD PTR [rbp-32], rax
    mov DWORD PTR [rbp-24], 1684828783
    mov BYTE PTR [rbp-20], 0
    lea rax, [rbp-32]
    mov rdi, rax
    call puts
    mov eax, 0
    mov rdx, QWORD PTR [rbp-8]
    xor rdx, QWORD PTR fs:40
    je .L3
    call __stack_chk_fail
.L3:
    leave
    ret
Upvotes: 1
Views: 130
Reputation: 144780
You are using a local buffer in function main, initialized from a string literal. The compiler implements this initialization by storing the 13 bytes of the string (12 characters plus the null terminator) at [rbp-32] with 3 mov stores. The first stores 8 bytes via rax, which is loaded with movabs because x86-64 cannot store a 64-bit immediate directly to memory; the second stores an immediate, as the value fits in 32 bits; the third stores a single byte.
8583909746840200520 in decimal is 0x77202c6f6c6c6548 in hex, corresponding to the bytes "Hello, w" in little-endian order; 1684828783 is 0x646c726f, the bytes "orld". The third mov sets the final '\0' byte. Hence the buffer contains "Hello, world".
This string is then passed to puts for output to stdout.
Note that gcc optimized the call printf("%s\n", st); to puts(st);! By the way, clang performs the same optimization.
Upvotes: 6
Reputation: 364503
Interesting.
If you'd written const char *str = "...", gcc would have passed puts a pointer to the string sitting there in the .rodata section, like in this godbolt link. (Well spotted by chqrlie that gcc is optimizing printf to puts.)
Your code forces the compiler to make a writable copy of the string literal, by using it to initialize a non-const char[]. (Actually, even with const char str[], gcc still generates it on the fly from mov-immediates. clang-3.7 spots the chance to optimize, though.)
Interestingly, gcc encodes the string as immediate data in the instructions, rather than copying it into the buffer from a stored copy. If the array had been global, it would have just been sitting there in the regular .data section, not .rodata.
Also, in general avoid using main() to look at compiler optimization. gcc deliberately marks it as "cold" and optimizes it less, which is an advantage for real programs that do their real work in other functions. In this case renaming main makes no difference. But usually, if you're looking at how gcc optimizes something, it's best to write a function that takes some args and uses them. Then you don't have to worry about gcc seeing that the inputs or loop bounds are compile-time constants, either.
Upvotes: 2