jcdmelo

Reputation: 11

GCC error for "vmull.u16 q7, d19, d8[0]" but not for "vmull.u16 q7, d19, d7[0]"

I am using Arm GNU Toolchain 12.2.Rel1 (Build arm-12.24) 12.2.1 20221205 on Windows 11. When I compile the following sequence of NEON instructions (vector multiply by scalar):

vmull.u16 q7, d19, d0[0]
vmull.u16 q7, d19, d8[0]

the first one assembles correctly, but the second one produces this error message:

ccuFDHko.s:5546: Error: scalar out of range for multiply instruction -- `vmull.u16 q7,d19,d8[0]'

After testing several more combinations of registers for the three operands, I am inclined to conclude that only registers below d8 (that is, d0 to d7) can be used for the scalar (third operand).
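The restriction inferred above can be written down as a small check. This is only a sketch: `scalar_operand_ok` is a hypothetical helper, not part of any toolchain. The 16-bit limits (d0 to d7, lanes 0 to 3) match the tests described here; the 32-bit limits (d0 to d15, lanes 0 to 1) are an assumption based on how the by-scalar instruction forms appear to be encoded, and were not tested.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if d<dreg>[lane] is assumed to be
 * usable as the scalar operand of a NEON multiply-by-scalar instruction.
 *
 * Assumed rule:
 *   16-bit elements: d0..d7,  lane 0..3  (matches the tests in this post)
 *   32-bit elements: d0..d15, lane 0..1  (assumption, not tested here)
 */
bool scalar_operand_ok(unsigned elem_bits, unsigned dreg, unsigned lane)
{
    switch (elem_bits) {
    case 16:
        return dreg <= 7 && lane <= 3;
    case 32:
        return dreg <= 15 && lane <= 1;
    default:
        /* Other element sizes are treated as unsupported in this sketch. */
        return false;
    }
}
```

Under this rule, `scalar_operand_ok(16, 0, 0)` is true and `scalar_operand_ok(16, 8, 0)` is false, which is consistent with the assembler accepting d0[0] and rejecting d8[0].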

I did not find any reference to this restriction in the NEON Programmer's Guide or on the ARM site.

Also, when I used the intrinsic

uint32x4_t vmull_lane_u16(uint16x4_t vec1, uint16x4_t val2, __constrange(0, 3) int val3);

the generated code always used "d7[0]" as the scalar.

I would appreciate any hints on this behavior or a reasonable explanation.

Thanks, Julio

