Reputation: 11
I'm learning assembly and I found nothing that helps me do this. Is it even possible? I can't make this work.
I want this code to take the "b" value, put it in `%eax`, and then move the content of `%eax` into my output and print that ASCII character, "0" in this case.
char a;
int b=48;
__asm__ (
// Here's the "Error: operand type mismatch for `mov'"
"movl %0, %%eax;"
"movl %%eax, %1;"
:"=r"(a)
:"r" (b)
:"%eax"
);
printf("%c\n",a);
Upvotes: 1
Views: 779
Reputation: 244682
The instruction responsible for the error is this one:
movl %0, %%eax
So, in order to figure out why it's causing an error, we need to understand what it says. It's a 32-bit `MOV` instruction (the `l` suffix in AT&T syntax means "long", aka DWORD). The destination operand is the 32-bit `EAX` register. The source operand is the first input/output operand, `a`. In other words, this:
"=r"(a)
which says that `char a;` is to be used as an output-only register.
As such, what the inline assembler wants to do is to generate code like the following:
movl %dl, %eax
(assuming, for the sake of argument, that `a` is allocated in the `dl` register, but it could just as easily have been allocated in any of the 8-bit registers). The problem is that this code is invalid because there is an operand size mismatch. The source operand and destination operand are different sizes: one is 8 bits while the other is 32 bits. This cannot work.
A workaround is the `movzx`/`movsx` instructions (introduced with the 80386), which move an 8-bit (or 16-bit) source operand into a 32-bit destination operand, with zero extension or sign extension, respectively. In AT&T syntax, the form that moves an 8-bit source into a 32-bit destination would be `movzbl` (for zero extension, used with unsigned values) or `movsbl` (for sign extension, used with signed values).
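To make the difference between the two extensions concrete, here is a minimal sketch. The function names `widen_zero` and `widen_sign` are made up for illustration, and the `q` constraint (a register that has an addressable low byte) is my choice, not something taken from the question. On non-x86 targets the sketch falls back to plain C conversions, which give the same results.

```c
#include <stdint.h>

/* Hypothetical helper: widen 8 bits to 32 bits with zero extension. */
uint32_t widen_zero(uint8_t byte)
{
    uint32_t wide;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* movzbl: 8-bit source, 32-bit destination, upper 24 bits cleared. */
    __asm__("movzbl %1, %0" : "=r"(wide) : "q"(byte));
#else
    wide = byte;            /* assumption: plain C stand-in on other targets */
#endif
    return wide;
}

/* Hypothetical helper: widen 8 bits to 32 bits with sign extension. */
int32_t widen_sign(int8_t byte)
{
    int32_t wide;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* movsbl: 8-bit source, 32-bit destination, upper 24 bits copy bit 7. */
    __asm__("movsbl %1, %0" : "=r"(wide) : "q"(byte));
#else
    wide = byte;            /* assumption: plain C stand-in on other targets */
#endif
    return wide;
}
```

The same bit pattern, 0xF0, comes out as 240 from `widen_zero(0xF0)` and as -16 from `widen_sign(-16)`.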
But wait, this is the wrong workaround. Your code is invalid for another reason: `a` is uninitialized! And not only is `a` uninitialized, but you've told the inline assembler via the output constraints that it is an output-only operand (the `=` sign)! So you can't read from it; you can only store into it.
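For contrast, here is a small sketch of an operand that the assembly both reads and writes. GCC spells that with `+` instead of `=`. The function name `add_one` is hypothetical, and the non-x86 branch is only there so the example builds elsewhere.

```c
/* Hypothetical helper: increment a value using a read-write asm operand. */
int add_one(int value)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* "+r": the register holds value on entry and the sum on exit.
       addl changes the flags, so "cc" is listed as clobbered. */
    __asm__("addl $1, %0" : "+r"(value) : : "cc");
#else
    value += 1;             /* assumption: plain C stand-in on other targets */
#endif
    return value;
}
```

Unlike an `=` operand, `value` must be initialized before the asm statement, because the instruction reads it.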
You have your operand notation backwards. What you really wanted was something like the following:
__asm__(
"movl %1, %%eax;"
"movl %%eax, %0;"
: "=r"(a)
: "r" (b)
: "%eax"
);
Of course, that's still going to give you an operand size mismatch, but it's now on the second assembly instruction. What this is telling the inline assembler to emit is the following code:
movl $48, %edx
movl %edx, %eax
movl %eax, %dl
which is invalid because a 32-bit source (`%eax`) cannot be moved into an 8-bit destination (`%dl`). And you can't fix this with `movzx`/`movsx`, because those are used to extend, not truncate. The way to write this would be the following:
movl $48, %edx
movl %edx, %eax
movb %al, %dl
where the last instruction is an 8-bit move, from an 8-bit source register to an 8-bit destination register.
In inline assembly, this would be written as:
__asm__(
"movl %1, %%eax;"
"movb %%al, %0;"
: "=r"(a)
: "r" (b)
: "%eax"
);
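Wrapped in a function so it can be compiled and tried, that might look like the sketch below. The name `low_byte_via_eax` is invented. I've used the `q` constraint for the output so that the compiler only picks a register with a byte-sized name; that is a precaution on my part for 32-bit targets, not part of the original code. The `#else` branch is a plain-C stand-in for non-x86 builds.

```c
/* Hypothetical helper: keep only the low 8 bits of value, going through EAX. */
char low_byte_via_eax(int value)
{
    char result;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("movl %1, %%eax\n\t"    /* 32-bit copy into the hard-coded register */
            "movb %%al, %0"         /* 8-bit copy of its low byte */
            : "=q"(result)
            : "r"(value)
            : "eax");               /* we overwrote EAX, so tell the compiler */
#else
    result = (char)value;           /* assumption: stand-in on other targets */
#endif
    return result;
}
```

`low_byte_via_eax(48)` returns `'0'`, and passing 0x141 returns `'A'` (0x41), because only the low 8 bits survive the `movb`.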
However, this is not the correct way to use inline assembly. You've manually hard-coded the `EAX` register inside of the inline assembly block, which means that you had to clobber it. The problem with this is that it ties the compiler's hands behind its back when it comes to register allocation. What you're supposed to do is put everything that goes into and out of the inline assembly block in the input and output operands. This lets the compiler handle all register allocation in the most optimal way possible. The code should look as follows:
char a;
int b = 48;
int temp;
__asm__(
"movl %2, %0\n\t"
"movb %b0, %1"
: "=r"(temp),
"=r"(a)
: "r" (b)
:
);
A lot of changes happened here:
- I introduced a temporary variable (`temp`) and added it to the output-only operands list. This causes the compiler to allocate a register for it automatically, which we then use inside of the asm block.
- The `b` modifier is needed on the source operand for the `movb` instruction to ensure that the byte-sized portion of that register is used, rather than the entire 32-bit register.
- Each instruction now ends with `\n\t` instead of a semicolon (except on the last one). This is what is recommended for use in inline assembly blocks, and it gets you nicer assembly output listings because it matches what the compiler does internally.

Even better would be to introduce symbolic names for the operands, making the code more readable:
char a;
int b = 48;
int temp;
__asm__(
"movl %[input], %[temp]\n\t"
"movb %b[temp], %[dest]"
: [temp] "=r"(temp),
[dest] "=r"(a)
: [input] "r" (b)
:
);
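As a compilable sketch of the same idea, the function below (the name `low_byte_named` is hypothetical) uses symbolic operand names and no clobber list. One deviation from the snippet above, which is an assumption on my part: the scratch and destination operands use `q` rather than `r`, so that on a 32-bit target the `%b` form always names a register that really has a byte-sized part.

```c
/* Hypothetical helper: same truncation, but the compiler picks every register. */
char low_byte_named(int value)
{
    char dest;
    int scratch;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("movl %[input], %[scratch]\n\t"   /* 32-bit copy into the scratch */
            "movb %b[scratch], %[dest]"       /* byte-sized part of the scratch */
            : [scratch] "=q"(scratch),
              [dest]    "=q"(dest)
            : [input]   "r"(value));
#else
    scratch = value;                 /* assumption: stand-in on other targets */
    dest = (char)scratch;
#endif
    return dest;
}
```

Because nothing is hard-coded, the compiler is free to reuse the input register for one of the outputs if that suits its allocation.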
And, at this point, if you hadn't noticed already, you'd see that this code is enormously silly. You don't need all those temporaries and register-register shuffling. You can just do:
movl $48, %eax
and the value 48 is already in `al`, since `al` is the low 8 bits of the 32-bit register `eax`.
Or, you can do:
movb $48, %al
which is just an 8-bit move of the value 48 explicitly into the 8-bit register `al`.
But, in fact, if you're calling `printf`, the argument must be passed as an `int` (not a `char`, since it's a variadic function), so you definitely want:
movl $48, %eax
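The promotion is easy to see from C itself. In this sketch (the function name `first_variadic_arg` is made up), a `char` passed to a variadic function is read back as an `int`, which is the type it arrives as after the default argument promotions:

```c
#include <stdarg.h>

/* Hypothetical helper: return the first variadic argument, read as int.
   count only exists because a variadic function needs one named parameter. */
int first_variadic_arg(int count, ...)
{
    va_list args;
    int value;

    va_start(args, count);
    value = va_arg(args, int);   /* a char argument has been promoted to int */
    va_end(args);
    return value;
}
```

Calling `first_variadic_arg(1, (char)'0')` hands back the character code of `'0'` as a full `int`, which is the same form in which `printf` receives the argument for `%c`.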
When you start using inline assembly, the compiler can't easily optimize through it, so you get inefficient code. All you really needed was:
int a = 48;
printf("%c\n",a);
Which produces the following assembly code:
pushl $48
pushl $AddressOfFormatString
call printf
addl $8, %esp
or, equivalently:
movl $48, %eax
pushl %eax
pushl $AddressOfFormatString
call printf
addl $8, %esp
Now, I imagine you're saying to yourself something like: "Yes, but if I do that, then I'm not using inline assembly!" To which my response is: exactly. You don't need inline assembly here, and in fact, you should not be using it, because it just causes problems. It's more difficult to write and leads to inefficient code generation.
If you want to learn assembly language programming, get an assembler and use that, not a C compiler's inline assembler. NASM is a popular and excellent choice, as is YASM. If you want to stick with the GNU assembler so you can keep using this tortuous AT&T syntax, then run `as` directly.
Upvotes: 1
Reputation: 14399
Since `a` is defined as a character (`char a;`), `:"=r"(a)` will assign an 8-bit register. The 32-bit register `EAX` cannot be loaded from an 8-bit register with a plain `movl`: `movl %dl, %eax` (`movl %0, %%eax`) will cause this error. There are sign-extend and zero-extend instructions for this purpose: `movsx` and `movzx` in Intel syntax, `movs...` and `movz...` in AT&T syntax.
Change
movl %0, %%eax;
to
movzbl %0, %%eax;
Upvotes: 0