StackOverflow Questions for Tag: neon

AG1

Reputation: 6764

__builtin_neon_vqtbl4q_v: what is difference between first arg (__a) and sixth arg (__c)

Score: -1

Views: 55

Answers: 1

fuz

Reputation: 92966

How do I cast a vector to a float64_t to check a SIMD compare for all-zero?

Score: 3

Views: 118

Answers: 1

curiouscupcake

Reputation: 1277

Accelerating matrix vector multiplication with ARM Neon Intrinsics on Raspberry Pi 4

Score: 1

Views: 2096

Answers: 3

fabian

Reputation: 1839

How to Load and Store data for the new AVX-VNNI and Arm Neon MMLA instructions efficiently?

Score: 1

Views: 86

Answers: 1

miluz

Reputation: 1423

Fastest way to test a 128 bit NEON register for a value of 0 using intrinsics?

Score: 7

Views: 3883

Answers: 6

user2092113

Reputation: 103

ARM NEON vectorization failure

Score: 5

Views: 4091

Answers: 2

ironman

Reputation: 3

Accumulate vector using Neon and print to stdout (assembly)

Score: 0

Views: 80

Answers: 2

Thomas Lavergne

Reputation: 38

vfmlalq_low_f16 and vfmlalq_high_f16 not setting their first operand to the result

Score: 0

Views: 37

Answers: 1

namea hang

Reputation: 11

Which execution ports can SIMD shuffles use for AVX2 and NEON?

Score: 1

Views: 88

Answers: 1

AG1

Reputation: 6764

Pack high bit of every byte in ARM NEON, for 64 bytes like AVX512 vpmovb2m?

Score: 3

Views: 192

Answers: 1

A23149577

Reputation: 2145

Accessing half of a register in AArch64 advanced SIMD

Score: 1

Views: 1257

Answers: 3

Steve Fan

Reputation: 3361

What is the fastest algorithm in SIMD to compare a pattern with a bit mask?

Score: 1

Views: 132

Answers: 1

Morteza

Reputation: 109

How to further optimize matrix multiplication in llm.c project?

Score: 4

Views: 171

Answers: 1

Bruno Causse

Reputation: 11

Why ARM NEON intrinsics are not faster than plain C++ for finding legal Othello moves?

Score: 1

Views: 116

Answers: 2

user27680699

Reputation: 51

How to exactly find the first matching zero in ARM using `shrn`, `fmov`, `rbit`, `clz`?

Score: 3

Views: 133

Answers: 1

leeyee

Reputation: 265

Compile ARM Neon intrinsics on macOS (M3 chipsets) using clang

Score: 0

Views: 311

Answers: 1

Mikhail T.

Reputation: 3967

Compiling assembly code on ARMv7: Clang vs. GNU

Score: 1

Views: 86

Answers: 1

Zvi Vered

Reputation: 613

ARM Intrinsic: Insert complex zero after each complex float sample

Score: 0

Views: 41

Answers: 1

HaggarTheHorrible

Reputation: 7403

ARM Cortex-A8: What's the difference between VFP and NEON?

Score: 51

Views: 40531

Answers: 5

Mono

Reputation: 81

Accessing NEON intrinsics from Go

Score: 3

Views: 90

Answers: 1
