Reputation: 263
I tried to compile some inline assembly for a 64-bit iOS application.
Here's an example:
int roundff(float _value) {
    int res;
    float temp;
    asm("vcvtr.s32.f32 %[temp], %[value] \n vmov %[res], %[temp]"
        : [res] "=r" (res), [temp] "=w" (temp)
        : [value] "w" (_value));
    return res;
}
and I get this error:
Unrecognized instruction mnemonic.
But this code compiles fine:
__asm__ volatile(
"add %[result], %[b], %[a];"
: [result] "=r" (result)
: [a] "r" (a), [b] "r" (b), [c] "r" (c)
);
Then I found that on AArch64 I have to use fcvt instead of vcvt, because
int a = (int)(10.123);
compiles into
fcvtzs w8, s8
but I don't know how to write it in inline assembler. Something like this
int roundff(float _value)
{
    int res;
    asm("fcvtzs %[res], %[value]" : [res] "=r" (res) : [value] "w" (_value));
    return res;
}
also doesn't work and generates these errors:
Instruction 'fcvtz' can not set flags, but 's' suffix specified.
Invalid operand for instruction.
Also, I need rounding to nearest instead of truncation (fcvtns).
Any help? Where can I read more about ARM (32/64-bit) asm?
UPDATE: OK, this: float res = nearbyintf(v) compiles into a nice single instruction, frinti s0, s0. But why doesn't my inline assembly work on iOS with the clang compiler?
Upvotes: 1
Views: 1720
Reputation: 2076
Here is how you do it:
-(int) roundff:(float)a {
int y;
__asm__("fcvtzs %w0, %s1\n\t" : "=r"(y) : "w"(a));
return y;
}
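For the round-to-nearest variant asked about (fcvtns), the same pattern should apply. Below is a minimal sketch in plain C, assuming GCC/Clang extended asm. The %w0 modifier asks for the 32-bit wN name of the integer register and %s1 for the scalar sN name of the FP register; without them the compiler, as far as I know, prints the default xN/vN register names, which the assembler rejects for this instruction. The function name round_nearest_even and the non-AArch64 fallback branch are my own additions for illustration, so that the snippet also builds on other hosts.

```c
/* Round a float to the nearest int, ties to even.
   Hypothetical helper name; only valid for inputs that fit in an int. */
int round_nearest_even(float value)
{
#if defined(__aarch64__)
    int res;
    /* %w0 -> 32-bit view (wN) of the general-purpose output register
       %s1 -> single-precision view (sN) of the FP/SIMD input register */
    __asm__("fcvtns %w0, %s1" : "=r"(res) : "w"(value));
    return res;
#else
    /* Portable fallback (an assumption, not part of the original answer):
       compute floor(value), then round the fraction with ties to even. */
    float fl = (float)(int)value;   /* truncates toward zero */
    if (fl > value)
        fl -= 1.0f;                 /* correct truncation to floor for negatives */
    float frac = value - fl;        /* in [0, 1) */
    int n = (int)fl;
    if (frac > 0.5f || (frac == 0.5f && (n & 1)))
        n += 1;                     /* round up, or break a tie toward the even side */
    return n;
#endif
}
```

With this, round_nearest_even(2.5f) gives 2 and round_nearest_even(3.5f) gives 4, whereas the fcvtzs version above would give 2 and 3.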
Take care,
/A
Upvotes: 3
Reputation: 363932
You can get the rounding you want using standard math.h functions that inline to single ARM instructions. Better yet, the compiler knows what they do, so it may be able to optimize better, e.g. by proving that the integer can't be negative, if that's the case.
Check godbolt for the compiler output:
#include <math.h>

int truncate_f_to_int(float v)
{
    int res = v; // standard C cast: truncate with fcvtzs on ARM64
    // AMD64: inlines to cvtTss2si rax, xmm0 // Note the extra T for truncate
    return res;
}

int round_f_away_from_zero(float v)
{
    int res = roundf(v); // optimizes to fcvtas on ARM64
    // AMD64: AND/OR with two constants before converting with truncation
    return res;
}

//#define NOT_ON_GODBOLT
// godbolt has a broken setup and gets x86-64 inline asm for lrintf on ARM64
#if defined(NOT_ON_GODBOLT) || defined (__x86_64__) || defined(__i386__)
int round_f_to_even(float v)
{
    int res = lrintf(v); // should inline to a convert using the current rounding mode
    // AMD64: inlines to cvtss2si rax, xmm0
    // nearbyintf(v); // ARM64: calls the math library
    // rintf(v); // ARM64: calls the math library
    return res;
}
#endif
godbolt has a buggy install of headers for non-x86 architectures: they still use the x86 math headers, including inline asm.
Also, your roundff function with inline asm for fcvtzs compiled just fine on godbolt with gcc 4.8. Maybe you were trying to build for 32-bit ARM? But like I said, use the library function that does what you want, then check that it inlines to nice asm.
Upvotes: 2