Alexey Kamenskiy

Reputation: 2948

LLVM IR generated code to native code

I am learning how compilers work. I have been following several books and tutorials, and at some point ran into an issue that I cannot resolve.

The complete tutorial code I followed can be found in its GitHub repository.

This code produces IR and executes it successfully. However, if I save the IR as example.ll and compile it to native assembly with llc, the resulting assembly fails to build into a native executable (using nasm and ld). I also tried compiling the IR to a native object file and linking it with g++ (the same way the parser is built in the tutorial's makefile), which also fails. I would like to find a way to compile my generated IR into an executable binary (at least for elf64).

The generated IR code [example.ll]:

; ModuleID = 'main'

@.str = private constant [4 x i8] c"%d\0A\00"

declare i32 @printf(i8*, ...)

define internal void @echo(i64 %toPrint) {
entry:
  %0 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([4 x i8]* @.str, i32 0, i32 0), i64 %toPrint)
  ret void
}

define internal void @main() {
entry:
  %0 = call i64 @do_math(i64 11)
  call void @echo(i64 %0)
  %1 = call i64 @do_math(i64 12)
  call void @echo(i64 %1)
  call void @printi(i64 10)
  ret void
}

declare void @printi(i64)

define internal i64 @do_math(i64 %a1) {
entry:
  %a = alloca i64
  store i64 %a1, i64* %a
  %x = alloca i64
  %0 = load i64* %a
  %1 = mul i64 %0, 5
  store i64 %1, i64* %x
  %2 = load i64* %x
  %3 = add i64 %2, 3
  ret i64 %3
}

Then via asm:

$ llc-3.5 -filetype=asm -x86-asm-syntax=intel -o example.asm example.ll
$ nasm example.asm
example.asm:1: error: attempt to define a local label before any non-local labels
example.asm:2: error: attempt to define a local label before any non-local labels
example.asm:2: error: parser: instruction expected
example.asm:3: error: attempt to define a local label before any non-local labels
example.asm:3: error: parser: instruction expected
example.asm:4: error: attempt to define a local label before any non-local labels
example.asm:4: error: parser: instruction expected
example.asm:5: error: parser: instruction expected
BB#0: # %entry:3: error: parser: instruction expected
BB#0: # %entry:12: error: parser: instruction expected
...
...
<many similar errors here>

Or via GCC:

$ llc-3.5 -filetype=obj -o example.o example.ll
$ g++ native.o example.o 
/usr/lib/gcc/x86_64-linux-gnu/4.8/../../../x86_64-linux-gnu/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status

PS: Contributions to the repository that modify the code to make this work would be more than welcome!

UPD: as requested

asm code:

    .text
    .file   "example.ll"
    .align  16, 0x90
    .type   echo,@function
echo:                                   # @echo
    .cfi_startproc
# BB#0:                                 # %entry
    pushq   %rax
.Ltmp0:
    .cfi_def_cfa_offset 16
    movq    %rdi, %rcx
    movl    $.L.str, %edi
    xorl    %eax, %eax
    movq    %rcx, %rsi
    callq   printf
    popq    %rax
    retq
.Ltmp1:
    .size   echo, .Ltmp1-echo
    .cfi_endproc

    .align  16, 0x90
    .type   main,@function
main:                                   # @main
    .cfi_startproc
# BB#0:                                 # %entry
    pushq   %rax
.Ltmp2:
    .cfi_def_cfa_offset 16
    movl    $11, %edi
    callq   do_math
    movq    %rax, %rdi
    callq   echo
    movl    $12, %edi
    callq   do_math
    movq    %rax, %rdi
    callq   echo
    movl    $10, %edi
    callq   printi
    popq    %rax
    retq
.Ltmp3:
    .size   main, .Ltmp3-main
    .cfi_endproc

    .align  16, 0x90
    .type   do_math,@function
do_math:                                # @do_math
    .cfi_startproc
# BB#0:                                 # %entry
    movq    %rdi, -8(%rsp)
    leaq    (%rdi,%rdi,4), %rax
    movq    %rax, -16(%rsp)
    leaq    3(%rdi,%rdi,4), %rax
    retq
.Ltmp4:
    .size   do_math, .Ltmp4-do_math
    .cfi_endproc

    .type   .L.str,@object          # @.str
    .section    .rodata,"a",@progbits
.L.str:
    .asciz  "%d\n"
    .size   .L.str, 4


    .section    ".note.GNU-stack","",@progbits

Output of nm:

$ nm example.o 
0000000000000060 t do_math
0000000000000000 t echo
0000000000000020 t main
                 U printf
                 U printi

Upvotes: 3

Views: 2735

Answers (1)

Martin Törnwall

Reputation: 9599

The assembly file

The reason you can't assemble example.asm is probably that it is in AT&T syntax, whereas nasm expects Intel syntax. It appears that you asked llc for Intel syntax but got the flag wrong: according to this manual, it's --x86-asm-syntax (note the double dash).

I suspect that you may be better off assembling with as (the GNU assembler) instead, as there are many mutually incompatible dialects of Intel syntax; I'm not really sure which one LLVM speaks. To do this, use the command:

$ as example.asm -o example.o

The object file

The reason you cannot link your object file with the C library is that you've declared your main function to have internal linkage (that's the define internal). Like the static keyword in C, it makes the symbol "invisible" outside of the object file, as evidenced by the lowercase 't' in the nm output.

When creating the LLVM function object for main, you should set its linkage type to llvm::GlobalValue::ExternalLinkage.

The same problem appears in the assembly file, of course - it's the absence of a .global main.


Don't take this to mean that you should give all functions external linkage; if a function is only used in the translation unit where it's defined, it really should have internal linkage. You just can't do it for main.

Upvotes: 2
