Douglas B

Reputation: 832

Cannot compile simple program which uses ARM Neon for Cortex A53

I am trying to cross-compile a large project (XNNPACK at this specific commit) for an ARM Cortex-A53 based Linux system. This project uses Arm's arm_neon.h header and intrinsics. While compiling, I get many errors of the same variety:

error: inlining failed in call to 'always_inline' 'vmaxq_f16': target specific option mismatch
| 31591 | vmaxq_f16 (float16x8_t __a, float16x8_t __b)

This error appears for multiple NEON functions, including vminq_f16, vmulq_f16, etc. To more easily troubleshoot, I created a simple test program which uses NEON:

#include <arm_neon.h>

int main() {
    float16x8_t a = vdupq_n_f16(1.0f);
    float16x8_t b = vdupq_n_f16(2.0f);
    float16x8_t c = vmulq_f16(a, b);
    return 0;
}

This program fails with the same error when compiled on the native platform with what I believe is the correct architecture (-march=armv8-a). With -march=armv8.2-a+fp16 it compiles, but the resulting binary dies with an illegal instruction error when run. I have tried multiple other combinations of flags to no avail. For some extra information:

The output of cat /proc/cpuinfo gives the following (for all 4 cores available):

Features    : fp asimd aes pmull sha1 sha2 crc32 cpuid
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x0
CPU part    : 0xd03
CPU revision    : 4

The presence of fp and asimd makes me think this feature is available. Further, this Arm documentation implies that NEON is very likely to be supported on a Cortex-A processor. The specific processor implementation is the one found in the Xilinx Zynq UltraScale+ MPSoC. Xilinx documentation for this MPSoC also implies that this NEON functionality should be present:

Advanced SIMD and floating-point extension implements Arm NEON technology... Advanced SIMD instructions are available in AArch64 and AArch32 states.
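A note on reading the Features line above: as far as I understand it, AArch64 Linux reports half-precision arithmetic with the separate fphp and asimdhp flags, while fp and asimd only cover the baseline floating-point and Advanced SIMD instructions. The sketch below checks a Features line for whole-word flags. The helper name has_feature is hypothetical and introduced only for illustration. It takes the line as a string, so it can be tried on any machine.

```c
#include <string.h>

/* Hypothetical helper (not part of any library): returns 1 if `name`
   appears as a whole token in a /proc/cpuinfo "Features" line, else 0.
   Tokens are separated by spaces, tabs, newlines or ':' so that
   "asimd" does not accidentally match "asimdhp". */
int has_feature(const char *line, const char *name) {
    size_t n = strlen(name);
    const char *p = line;

    if (n == 0) {
        return 0;
    }
    while (*p != '\0') {
        /* Skip any run of separators before the next token. */
        while (*p == ' ' || *p == '\t' || *p == '\n' || *p == ':') {
            p++;
        }
        const char *start = p;
        /* Advance to the end of the current token. */
        while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n' && *p != ':') {
            p++;
        }
        if ((size_t)(p - start) == n && strncmp(start, name, n) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Applied to the Features line from my board, has_feature finds fp and asimd but neither fphp nor asimdhp, which would be consistent with the illegal instruction error from the +fp16 build.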

Main Questions:

At this point I am not sure whether these functions are unsupported on my CPU, in which case I should try to get the project to compile without them, or whether they are supported and I am missing compiler options or something else. Any help in either of those directions would be greatly appreciated.
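For the first direction, here is a minimal sketch of how the test program could be guarded so that it builds with or without the f16 intrinsics. It assumes the ACLE macro __ARM_FEATURE_FP16_VECTOR_ARITHMETIC, which compilers define only when the selected target has half-precision vector arithmetic. The names HAVE_FP16_VECTOR_ARITH and mul_first_lane are hypothetical and used only for illustration.

```c
/* The ACLE macro below is defined by the compiler only when the selected
   target (for example -march=armv8.2-a+fp16) has half-precision vector
   arithmetic. With -march=armv8-a it is expected to be undefined. */
#if defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
#include <arm_neon.h>
#define HAVE_FP16_VECTOR_ARITH 1
#else
#define HAVE_FP16_VECTOR_ARITH 0
#endif

/* Hypothetical helper: multiplies x by y. Uses the same f16 intrinsics as
   the test program when the target provides them, plain float otherwise. */
float mul_first_lane(float x, float y) {
#if HAVE_FP16_VECTOR_ARITH
    float16x8_t a = vdupq_n_f16((float16_t)x);
    float16x8_t b = vdupq_n_f16((float16_t)y);
    float16x8_t c = vmulq_f16(a, b);
    return (float)vgetq_lane_f16(c, 0); /* all lanes hold the same product */
#else
    return x * y; /* scalar fallback for targets without fp16 arithmetic */
#endif
}
```

Note that the guard follows the -march flag given to the compiler, not the CPU the binary eventually runs on, so it avoids the "target specific option mismatch" error but would not by itself prevent an illegal instruction if the code is built with +fp16 and run on a core without that extension.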

Upvotes: 0

Views: 470
