Reputation: 364
In LLVM IR, when I want to get a value from an array, it seems that there are three ways to do this: using extractvalue, using extractelement, and using a getelementptr followed by a load.
However, the language reference does not make clear which one to use in which case. Apart from the obvious differences (extractvalue can also access members of, e.g., structs where extractelement cannot; the syntax is a bit different; GEP only does address computation, whereas extractvalue and extractelement seem to also do a memory dereference), in which scenario would each of these instructions be used?
For instance, in the following C code
int arr[2];
// do some stuff with arr
int i = arr[0];
the third line can be written in IR as:
%0 = extractvalue [2 x i32] @arr, i32 0
%0 = extractelement [2 x i32] @arr, i32 0
%0 = load i32* getelementptr inbounds ([2 x i32]* @arr, i32 0, i32 0)
If I'm not mistaken these three IR lines do exactly the same thing. Another C program I compiled to IR contained the line
printf(" %d", a[i]);
When I compile this with clang, the corresponding IR looks like this:
%25 = load i32, i32* %i14, align 4
%26 = load i32*, i32** %a, align 4
%arrayidx18 = getelementptr inbounds i32, i32* %26, i32 %25
%27 = load i32, i32* %arrayidx18, align 4
%call20 = invoke i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str.1, i32 0, i32 0), i32 %27)
So why is getelementptr used here and not, e.g., extractelement? When are the other instructions used?
Upvotes: 3
Views: 3964
Reputation: 618
In LLVM IR, unlike C, arrays are value types that live in virtual registers, so you can't GEP to an element of an array value. Once you store that array somewhere in memory and have a pointer to it, that's when you use GEP followed by a load or store.
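As a minimal sketch of that point (the function name @first_element and the stack slot %slot are made up for illustration, and the typed-pointer syntax matches the IR in the question; recent LLVM releases spell every pointer type as ptr), an array value has to be placed in memory before GEP has anything to point into:

```llvm
define i32 @first_element([2 x i32] %a) {
entry:
  ; %a is an array value in a virtual register, so it has no address yet.
  ; Spill it to a stack slot to obtain a pointer to the array.
  %slot = alloca [2 x i32], align 4
  store [2 x i32] %a, [2 x i32]* %slot, align 4
  ; GEP only computes the address of element 0; the load reads the value.
  %p = getelementptr inbounds [2 x i32], [2 x i32]* %slot, i32 0, i32 0
  %v = load i32, i32* %p, align 4
  ret i32 %v
}
```

With a constant index, extractvalue on %a would avoid the round trip through memory. Going through a pointer is the usual route when the index is only known at run time.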
Upvotes: 3
Reputation: 8088
In the LLVM 4.0 documentation:
extractelement is specific to vectors and returns the scalar value at the index provided (undefined if the index is out of bounds).
extractvalue is generic for structures and arrays (aggregate types) and returns the value of the field or element addressed by the index. It supports multiple indexes and can be used to access nested elements, in a fashion similar to the indexing used in getelementptr.
getelementptr (as its name implies) returns a pointer to a location, whereas the other operations above return a value. Hence the load instruction is required to get the value (or store if writing).
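To make the distinction concrete, here is a small sketch of the three forms side by side. The global @arr comes from the question; the function names, the initializer values and the parameters are assumptions chosen for illustration. It uses the same typed-pointer syntax as the IR in the question, whereas newer LLVM versions use an opaque ptr type instead.

```llvm
@arr = global [2 x i32] [i32 10, i32 20]

define i32 @from_memory() {
entry:
  ; @arr is a pointer to an array in memory: compute the address, then load.
  %p = getelementptr inbounds [2 x i32], [2 x i32]* @arr, i32 0, i32 0
  %v = load i32, i32* %p, align 4
  ret i32 %v
}

define i32 @from_array_value([2 x i32] %a) {
entry:
  ; %a is an aggregate value: extractvalue takes constant indices only.
  %v = extractvalue [2 x i32] %a, 0
  ret i32 %v
}

define i32 @from_vector_value(<2 x i32> %vec, i32 %i) {
entry:
  ; %vec is a vector value: extractelement accepts a run-time index.
  %v = extractelement <2 x i32> %vec, i32 %i
  ret i32 %v
}
```

Note that the operand type decides which instruction applies: a pointer for getelementptr, an aggregate value for extractvalue, and a vector value for extractelement.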
I am in the process of writing a compiler that emits LLVM IR, and I've generalized most vector and aggregate access (reads and writes) using getelementptr so as to simplify the emit code. However, other compilers may do a deeper analysis and generate type-specific LLVM operations.
Upvotes: 4