Per M.

Reputation: 95

clang misses assembler error?

It seems to me that clang++ misses errors in assembler code that g++ picks up. Or am I missing some compiler flag for clang? I'm new to assembler code.

Using clang++, my application compiled and linked free of errors and warnings, yet I got nasty segmentation faults. After switching to g++, on the other hand, I got these errors:

GO_F_ImageColourConversion.cpp: Assembler messages:
GO_F_ImageColourConversion.cpp:4679: Error: `(%rsi,%edx,2)' is not a valid base/index expression 
GO_F_ImageColourConversion.cpp:4682: Error: `(%rcx,%edx,1)' is not a valid base/index expression

I am using these compiler flags: -DLINUX -g -Wno-deprecated -D_GNU_SOURCE -D_REENTRANT -D__STDC_CONSTANT_MACROS -fPIC -fPIE

I have the following code (omitting irrelevant parts):

Ipp8u * pSrc;
Ipp8u * pDst;
int x, y;

                asm volatile
                    (
                     "movl      (%1, %0, 2), %%eax;\n"
                     "shlw      $8, %%ax;\n"
                     "shrl      $8, %%eax;\n"
                     "movw      %%ax, (%2, %0, 1);\n"

                    : /* no output */
                    : "r" (x), "r" (pSrc), "r" (pDst)
                    : "eax", "memory");
            }

From looking at this answer on SO, I realized I had a 32/64-bit issue (I am porting to 64-bit). An Ipp8u* is 8 bytes, but an int is only 4 bytes on my machine.

Changing the declaration to uintptr_t x, y; seems to fix the issue. Why does clang not give an error at compile time?

Upvotes: 0

Views: 567

Answers (1)

Peter Cordes

Reputation: 364408

gcc and clang both choke on your code for me:

6 : error: base register is 64-bit, but index register is not
"movl (%1, %0, 2), %%eax\n"
^
<inline asm>:1:13: note: instantiated into assembly here
movl (%rdi, %edx, 2), %eax

That's from clang 3.8 on the Godbolt compiler explorer, with your snippet wrapped in a function so it's testable (something the question didn't provide). Are you sure your clang was building 64-bit code? (-m64, not -m32 or -mx32).

Provide a link to your code on godbolt with some version of clang silently mis-compiling it, otherwise all I can say for your actual question is just "can't reproduce".

And yes, your problem is that x is an int, which leads to mixed register sizes in the addressing mode. (%rsi,%edx,2) isn't encodable, because the base and index registers must be the same size.


Using %q0 to get %rdx doesn't guarantee that the high 32 bits of the register are free of garbage (although garbage there is highly unlikely). Instead, you could use "r" ((int64_t)x) to sign-extend x to 64 bits.
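A minimal sketch of that cast, assuming an x86-64 target and GNU-style inline asm. load_scaled is a hypothetical helper (not from the question) that performs only the first load from the original snippet. The #else branch is a plain C++ fallback so the example still builds on other targets.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: load 32 bits from base + 2*x, like the first
// instruction in the question's asm block.
inline std::uint32_t load_scaled(const std::uint8_t* base, int x) {
    std::uint32_t out;
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    // Casting x to int64_t makes the compiler sign-extend it and pick a
    // 64-bit register for %[i], so base and index have matching sizes.
    asm ("movl (%[b], %[i], 2), %[o]"
         : [o] "=r" (out)
         : [b] "r" (base), [i] "r" ((std::int64_t)x)
         : "memory");  // the asm reads memory that is not listed as an operand
#else
    // Assumption: non-x86-64 build, so do the same unaligned load in plain C++.
    std::memcpy(&out, base + 2 * (std::int64_t)x, sizeof out);
#endif
    return out;
}
```

A negative x is the case where this matters most: with a sign-extended 64-bit index the address is base - 2*|x|, whereas a register with stale upper bits would point somewhere else entirely.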

Why do you need inline asm at all? How bad is the compiler output for your C version of this?
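For comparison, here is a rough sketch of what a plain C++ version might look like, assuming I've read the four instructions correctly: on little-endian x86 they keep bytes 0 and 2 of the loaded dword and store them as one 16-bit value. The names pack_even_bytes and convert_at are hypothetical, not from the question.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: what the shlw/shrl pair does to one 32-bit value.
inline std::uint16_t pack_even_bytes(std::uint32_t v) {
    std::uint32_t low = (v << 8) & 0xFF00u;   // shlw $8, %ax (only the low 16 bits shift)
    v = (v & 0xFFFF0000u) | low;              // the upper half of eax is unchanged
    v >>= 8;                                  // shrl $8, %eax
    return static_cast<std::uint16_t>(v);     // movw stores only %ax
}

// Hypothetical wrapper: byte-wise load and store in little-endian order,
// using the same offsets as the question (source 2*x, destination x).
inline void convert_at(const std::uint8_t* pSrc, std::uint8_t* pDst, std::ptrdiff_t x) {
    const std::uint8_t* s = pSrc + 2 * x;
    std::uint32_t v = static_cast<std::uint32_t>(s[0])
                    | (static_cast<std::uint32_t>(s[1]) << 8)
                    | (static_cast<std::uint32_t>(s[2]) << 16)
                    | (static_cast<std::uint32_t>(s[3]) << 24);
    std::uint16_t r = pack_even_bytes(v);
    pDst[x]     = static_cast<std::uint8_t>(r & 0xFFu);
    pDst[x + 1] = static_cast<std::uint8_t>(r >> 8);
}
```

Written this way, the index is an ordinary ptrdiff_t, so there is no register-size mismatch to get wrong, and the compiler is free to simplify or vectorize a loop around it.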

If you do want to use inline asm, this is much better:

uint32_t asm_tmp = *(uint32_t *)(x*2 + (char*)pSrc);  // I think I've reproduced the same pointer math as the addressing mode you used.
asm ( "shlw      $8, %w[v]\n\t"    // e.g.  ax
      "shrl      $8, %k[v]\n\t"    // e.g. eax.  potential partial-register slowdown from reading eax after writing ax on older CPUs
      : [v] "+&r" (asm_tmp)
      );
*(uint16_t *)(x + (char*)pDst) = asm_tmp;  // store the low 16

This compiles nicely with clang, but gcc is kinda braindead about generating the address. Maybe with a different expression for the addresses?

Your code was defeating the purpose of constraints by starting with a load and ending with a store. Always let the compiler handle as much as possible. It's possible you'd get better code from this without inline asm, and the compiler would understand what it does and could potentially auto-vectorize or do other transformations. Removing the need for the asm statement to be volatile with a "memory" clobber is already a big improvement for the optimizer: Now it's a pure function that the compiler knows just transforms one register.

Also see the end of this answer for more guides to writing inline asm that doesn't suck.

Upvotes: 2
