Omar Boninsegna

Reputation: 11

Assembly inline AT&T Type mismatch

I'm learning assembly and I found nothing that helps me do this. Is it even possible? I can't make this work.

I want this code to take the value of "b", put it in %eax, then move the contents of %eax into my output variable and print it as an ASCII character ("0" in this case).

char a;
int b=48;
__asm__ ( 
//Here's the error: "operand type mismatch for `mov'"
"movl %0, %%eax;"
"movl %%eax, %1;"

:"=r"(a)
:"r" (b)
:"%eax"
);

printf("%c\n",a);

Upvotes: 1

Views: 779

Answers (2)

Cody Gray

Reputation: 244682

The instruction responsible for the error is this one:

movl %0, %%eax

So, in order to figure out why it's causing an error, we need to understand what it says. It's a 32-bit MOV instruction (the l suffix in AT&T syntax means "long", aka DWORD). The destination operand is the 32-bit EAX register. The source operand is the first input/output operand, a. In other words, this:

"=r"(a)

which says that char a; is to be used as an output-only register.

As such, what the inline assembler wants to do is to generate code like the following:

movl %dl, %eax

(assuming, for the sake of argument, that a is allocated in the dl register; it could just as easily have been allocated in any other 8-bit register). The problem is that this code is invalid because of an operand size mismatch: the source operand is 8 bits while the destination operand is 32 bits. This cannot work.

A workaround is the movzx/movsx instructions (introduced with the 80386) which move an 8 (or 16) bit source operand into a 32-bit destination operand, either with zero extension or sign extension, respectively. In AT&T syntax, the form that moves an 8-bit source into a 32-bit destination would be movzbl (for zero extension, used with unsigned values) or movsbl (for sign extension, used with signed values).
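To make the difference between the two extensions concrete, here is a minimal sketch. The helper names zero_extend8 and sign_extend8 are hypothetical, and the "q" input constraint is my assumption to guarantee a byte-addressable register. On non-x86 targets the functions fall back to plain C casts, which perform the same widening.

```c
#include <stdint.h>

/* Hypothetical helper: widen an 8-bit value to 32 bits with zero extension. */
uint32_t zero_extend8(uint8_t v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t out;
    /* "q" asks for a register that has an 8-bit name (assumption, see lead-in). */
    __asm__("movzbl %1, %0" : "=r"(out) : "q"(v));
    return out;
#else
    return (uint32_t)v;   /* portable equivalent */
#endif
}

/* Hypothetical helper: widen an 8-bit value to 32 bits with sign extension. */
int32_t sign_extend8(int8_t v)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    int32_t out;
    __asm__("movsbl %1, %0" : "=r"(out) : "q"(v));
    return out;
#else
    return (int32_t)v;    /* portable equivalent */
#endif
}
```

The same bit pattern 0xF0 widens to 240 when treated as unsigned and to -16 when treated as signed, which is why the choice between movzbl and movsbl depends on the signedness of the source.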

But wait—this is the wrong workaround. Your code is invalid for another reason: a is uninitialized! And not only is a uninitialized, but you've told the inline assembler via the output constraints it is an output-only operand (the = sign)! So you can't read from it—you can only store into it.

You have your operand notation backwards. What you really wanted was something like the following:

__asm__( 
        "movl %1, %%eax;"
        "movl %%eax, %0;"

        : "=r"(a)
        : "r" (b)
        : "%eax"
       );

Of course, that's still going to give you an operand size mismatch, but it's now on the second assembly instruction. What this is telling the inline assembler to emit is the following code:

    movl $48,  %edx
    movl %edx, %eax
    movl %eax, %dl

which is invalid because a 32-bit source (%eax) cannot be moved into an 8-bit destination (%dl). And you can't fix this with movzx/movsx, because that is used to extend, not truncate. The way to write this would be the following:

    movl $48,  %edx
    movl %edx, %eax
    movb %al,  %dl

where the last instruction is an 8-bit move, from an 8-bit source register to an 8-bit destination register.

In inline assembly, this would be written as:

__asm__( 
        "movl %1, %%eax;"
        "movb %%al, %0;"

        : "=r"(a)
        : "r" (b)
        : "%eax"
       );

However, this is not the correct way to use inline assembly. You've manually hard-coded the EAX register inside of the inline assembly block, which means that you had to clobber it. The problem with this is that it ties the compiler's hands behind its back when it comes to register allocation. What you're supposed to do is put everything that goes into and out of the inline assembly block in the input and output operands. This lets the compiler handle all register allocation in the most optimal way possible. The code should look as follows:

char a;
int  b = 48;
int temp;
__asm__( 
        "movl %2, %0\n\t"
        "movb %b0, %1"

        : "=r"(temp),
          "=r"(a)
        : "r" (b)
        :
       );

A lot of changes happened here:

  • I introduced another temporary variable (appropriately named temp) and added it to the output-only operands list. This causes the compiler to allocate a register for it automatically, which we then use inside of the asm block.
  • Now that we're letting the compiler do the register allocation, we don't need a clobber list, so that's left empty.
  • The b modifier is needed on the source operand for the movb instruction to ensure that the byte-sized portion of that register is used, rather than the entire 32-bit register.
  • Instead of using semicolons at the end of each asm instruction, I used \n\t (except on the last one). This is what is recommended for use in inline assembly blocks, and it gets you nicer assembly output listings because it matches what the compiler does internally.
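For experimenting, the snippet can be wrapped in a function, roughly as sketched below. The function name low_byte_of is hypothetical. I have also swapped "=r" for "=q" on both outputs, which is my own assumption rather than part of the code above: in 32-bit mode only a, b, c and d have byte-sized names, so "q" should keep the compiler from picking a register for which %b0 has no valid encoding. In 64-bit mode "q" behaves like "r". A plain C fallback is used on other targets.

```c
/* Hypothetical helper: copy b into a scratch register, then store its low byte. */
char low_byte_of(int b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    char a;
    int  temp;
    __asm__(
        "movl %2, %0\n\t"     /* temp = b (32-bit move)              */
        "movb %b0, %1"        /* a = low byte of temp (8-bit move)   */
        : "=q"(temp),         /* "q": register with an 8-bit name    */
          "=q"(a)
        : "r"(b)
    );
    return a;
#else
    return (char)b;           /* portable equivalent of the truncation */
#endif
}
```

Calling low_byte_of(48) should yield the character '0', matching the value the question wants to print.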

Even better would be to introduce symbolic names for the operands, making the code more readable:

char a;
int  b = 48;
int temp;
__asm__( 
        "movl %[input], %[temp]\n\t"
        "movb %b[temp], %[dest]"

        : [temp]  "=r"(temp),
          [dest]  "=r"(a)
        : [input] "r" (b)
        :
       );

And, at this point, if you hadn't noticed already, you'd see that this code is enormously silly. You don't need all those temporaries and register-register shuffling. You can just do:

    movl $48, %eax

and the value 48 is already in al, since al is the low 8 bits of the 32-bit register eax.

Or, you can do:

    movb $48, %al

which is just an 8-bit move of the value 48 explicitly into the 8-bit register al.
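The relationship between eax and al can also be expressed in plain C. This is only an illustrative sketch; the function name low8 is hypothetical, and the parameter simply stands in for whatever value the 32-bit register holds.

```c
#include <stdint.h>

/* Hypothetical helper: return the low 8 bits of a 32-bit value,
   i.e. what al would hold if eax_value were in eax. */
uint8_t low8(uint32_t eax_value)
{
    return (uint8_t)(eax_value & 0xFFu);
}
```

For a value such as 48 that fits in 8 bits, the low byte equals the whole value, so both moves leave the same thing in al.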

But, in fact, if you're calling printf, the argument must be passed as an int (not a char, since it's a variadic function), so you definitely want:

    movl $48, %eax
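The promotion rule for variadic arguments can be observed from C directly. The sketch below uses a hypothetical function, first_vararg, that reads its first variadic argument as an int; a char passed by the caller arrives already promoted.

```c
#include <stdarg.h>

/* Hypothetical variadic helper: returns its first variadic argument.
   The count parameter exists only because va_start needs a named parameter. */
int first_vararg(int count, ...)
{
    va_list ap;
    int value;

    va_start(ap, count);
    value = va_arg(ap, int);  /* a char argument is read as int here */
    va_end(ap);
    return value;
}
```

Reading the argument with va_arg(ap, char) instead would be incorrect, because the caller never passes a bare char through the variable argument list.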

When you start using inline assembly, the compiler can't easily optimize through it, so you get inefficient code. All you really needed was:

int a = 48;
printf("%c\n",a);

Which produces the following assembly code:

    pushl   $48
    pushl   $AddressOfFormatString
    call    printf
    addl    $8, %esp

or, equivalently:

    movl    $48, %eax
    pushl   %eax
    pushl   $AddressOfFormatString
    call    printf
    addl    $8, %esp

Now, I imagine you're saying to yourself something like: "Yes, but if I do that, then I'm not using inline assembly!" To which my response is: exactly. You don't need inline assembly here, and in fact, you should not be using it, because it just causes problems. It's more difficult to write and leads to inefficient code generation.

If you want to learn assembly language programming, get an assembler and use that, not a C compiler's inline assembler. NASM is a popular and excellent choice, as is YASM. If you want to stick with the GNU assembler so you can keep using this tortuous AT&T syntax, then run as directly.

Upvotes: 1

rkhb

Reputation: 14399

Since a is defined as a character (char a;), :"=r"(a) will assign an 8-bit register. The 32-bit register EAX cannot be loaded from an 8-bit register with movl: movl %dl, %eax (movl %0, %%eax) will cause this error. For this purpose there are the zero-extend and sign-extend instructions movzx and movsx (Intel syntax); in AT&T syntax they are spelled movz... and movs....

Change

movl %0, %%eax;

to

movzbl %0, %%eax;

Upvotes: 0
