StackOverflow Questions for Tag: gpu-warp

ctrlaltdel

Reputation: 5

Warp Reduce primitives to threads sharing same value

Score: 0

Views: 69

Answers: 1

xwt1

Reputation: 15

Does __shfl_sync in CUDA always operate on registers, or does it involve shared memory or global memory in some certain situations?

Score: 0

Views: 132

Answers: 1

Paul

Reputation: 515

Understanding thread utilization in the CUDA reduction examples

Score: 2

Views: 64

Answers: 1

Serge Rogatch

Reputation: 15090

Pre 8.x equivalent of __reduce_max_sync() in CUDA

Score: 1

Views: 374

Answers: 2

Johan

Reputation: 76724

How to partition data in a warp based on a predicate so all keep items are consecutive

Score: 0

Views: 46

Answers: 1

gonidelis

Reputation: 1063

What is warp shuffling in CUDA and why is it useful?

Score: 6

Views: 5156

Answers: 1

Regis Portalez

Reputation: 4860

Thread/warp local lock in cuda

Score: 0

Views: 1147

Answers: 1

Armin Rigo

Reputation: 12990

On today's GPUs, can warps be recombined dynamically?

Score: 1

Views: 54

Answers: 0

pem

Reputation: 465

Compute per-warp histogram without shared memory

Score: 2

Views: 172

Answers: 2

SnowSR

Reputation: 3

CUDA __shfl_down_sync does not work with __match_any_sync

Score: -1

Views: 582

Answers: 1

Fabio T.

Reputation: 129

__activemask() vs __ballot_sync()

Score: 9

Views: 5681

Answers: 1

einpoklum

Reputation: 132108

How do I do the converse of shfl.idx (i.e. warp scatter instead of warp gather)?

Score: 2

Views: 364

Answers: 2

nanofarad

Reputation: 41281

Why is my CUDA warp shuffle sum using the wrong offset for one shuffle step?

Score: 4

Views: 787

Answers: 1

Timocafé

Reputation: 765

Warp shuffling for CUDA

Score: 1

Views: 4007

Answers: 3

einpoklum

Reputation: 132108

Are threads in multi-dimensional CUDA kernel blocks packed to fill warps?

Score: 1

Views: 685

Answers: 1

Silicomancer

Reputation: 9196

Monitor active warps and threads during a divergent CUDA run

Score: 0

Views: 692

Answers: 2

sg_man

Reputation: 833

In CUDA, how can I get this warp's thread mask in conditionally executed code (in order to execute, e.g., __shfl_sync or <cg>.shfl)?

Score: 0

Views: 954

Answers: 1

Gabriel

Reputation: 9442

How are 2D / 3D CUDA blocks divided into warps?

Score: 21

Views: 7327

Answers: 2

Johan

Reputation: 76724

What's the alternative for __match_any_sync on compute capability 6?

Score: 4

Views: 1293

Answers: 1

user6671981

Reputation:

Why use thread blocks larger than the number of cores per multiprocessor

Score: 0

Views: 335

Answers: 1
