StackOverflow Questions for Tag: neon

Steven Daniel Anderson

Reputation: 1443

How to Optimize 1024x1024 Matrix Multiplication in C to Match NumPy's Performance on M1 silicon

Score: 0

Views: 194

Answers: 0

Nidhoegger

Reputation: 5232

NEON: Optimize code

Score: 0

Views: 400

Answers: 1

Douglas B

Reputation: 832

Cannot compile simple program which uses ARM Neon for Cortex A53

Score: 0

Views: 470

Answers: 0

scoobydoo

Reputation: 123

typecast float32 to int16 using arm neon intrinsics

Score: 0

Views: 328

Answers: 1

jcdmelo

Reputation: 11

GCC error for "vmull.u16 q7, d19, d8[0]" but not for "vmull.u16 q7, d19, d7[0]"

Score: 1

Views: 47

Answers: 0

Pascal de Kloe

Reputation: 560

Bit scatter over multiple NEON registers

Score: 0

Views: 202

Answers: 1

Lazyloper

Reputation: 41

What's the difference between LD3 (multiple structures) and LD3 (single structure) in ARMv8-A

Score: 2

Views: 343

Answers: 1

Russell Newman

Reputation: 83

SIMD bit reordering of packed 12-bit integer array

Score: 3

Views: 352

Answers: 2

Zz Tux

Reputation: 658

What are the differences between `ld1`/`st1` and `ldr`/`str`, `ldp`/`stp` instructions when operating on one or two vector registers?

Score: 0

Views: 535

Answers: 0

jcdmelo

Reputation: 1

ARM NEON: why is vector code slower than scalar?

Score: 0

Views: 579

Answers: 2

0xcaff

Reputation: 13681

Is there a difference between vst1.8 and vst1.32?

Score: 1

Views: 186

Answers: 1

Yates Zhang

Reputation: 1

Neon intrinsic vsubq_f32?

Score: 0

Views: 103

Answers: 0

ilp

Reputation: 21

ARM NEON Intrinsics: Is using vmaxvq_s16() the fastest way to find the max value in an int16x8 vector?

Score: 0

Views: 329

Answers: 1

Bogi

Reputation: 2618

Transpose 4x4 int32 matrix using NEON

Score: 0

Views: 348

Answers: 1

J. Rehbein

Reputation: 117

Is it slow to branch on a memory offset incremented by vld1q in ARM NEON?

Score: 0

Views: 85

Answers: 1

Aki Suihkonen

Reputation: 20037

Using Horizontal Neon intrinsics efficiently

Score: 2

Views: 241

Answers: 1

maksdrv

Reputation: 3

ARM Assembly Vector addition

Score: 0

Views: 303

Answers: 1

spaghetti_boy

Reputation: 1

Converting RGB to Grayscale with C ARM NEON

Score: 0

Views: 114

Answers: 0

TrentP

Reputation: 4722

Efficient C vectors for generic SIMD (SSE, AVX, NEON) test for zero matches. (find FP max absolute value and index)

Score: 6

Views: 1500

Answers: 3

ilp

Reputation: 21

ARM NEON Saturating Doubling Multiply C/C++ Intrinsic for signed 8-bit integers

Score: 2

Views: 158

Answers: 0
