user2052436

Reputation: 4765

GPU (Nvidia) TLB misses

There is plenty of documentation and many publications on CUDA/Nvidia GPUs, but I have never encountered anything about TLBs.

  1. Do GPUs use TLBs similar to CPUs (and, therefore, have TLB hits/misses)?

  2. How are TLB misses handled? By the CUDA driver or by the GPU hardware?

  3. Are there cases when TLB misses cause significant/noticeable performance impact?

Upvotes: 1

Views: 780

Answers (1)

Homer512

Reputation: 13310

A TLB does exist. I am not aware of any official documentation, but its size can be determined via reverse engineering. See, for example, Zhe Jia et al., "Dissecting the NVidia Turing T4 GPU via Microbenchmarking":

[…] within the available global memory size, there are two levels of TLB on the Turing GPUs. The L1 TLB has 2 MiB page entries and 32 MiB coverage. The coverage of the L2 TLB is about 8192 MiB, which is the same as Volta.
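To make the quoted numbers concrete, here is a rough sketch of why a coverage figure shows up in a microbenchmark at all. It is a toy model, not the paper's code and not real GPU behavior: it assumes coverage = entries × page size, a fully associative TLB, and LRU replacement, none of which is confirmed for the hardware. The names `tlb_entries` and `miss_rate` are made up for this illustration.

```python
from collections import OrderedDict

MIB = 1024 * 1024


def tlb_entries(coverage_bytes, page_bytes):
    """Entry count implied by coverage = entries * page size (assumption)."""
    return coverage_bytes // page_bytes


def miss_rate(footprint_bytes, page_bytes, entries, passes=4):
    """Toy fully associative LRU TLB (hypothetical model, not the real HW).

    Touches one address per page across the footprint, repeatedly, and
    returns the miss rate measured after one warm-up pass.
    """
    tlb = OrderedDict()  # page number -> dummy value, ordered by recency
    pages = footprint_bytes // page_bytes
    misses = accesses = 0
    for p in range(passes + 1):  # pass 0 is the warm-up pass
        for page in range(pages):
            hit = page in tlb
            if hit:
                tlb.move_to_end(page)  # mark as most recently used
            else:
                tlb[page] = True
                if len(tlb) > entries:
                    tlb.popitem(last=False)  # evict least recently used
            if p > 0:
                accesses += 1
                misses += not hit
    return misses / accesses


# Quoted L1 figures: 2 MiB pages, 32 MiB coverage.
L1_ENTRIES = tlb_entries(32 * MIB, 2 * MIB)

if __name__ == "__main__":
    print("implied L1 TLB entries:", L1_ENTRIES)
    for footprint_mib in (16, 32, 34, 64):
        rate = miss_rate(footprint_mib * MIB, 2 * MIB, L1_ENTRIES)
        print(f"{footprint_mib:3d} MiB footprint -> miss rate {rate:.2f}")
```

In this model the miss rate jumps from 0 to 1 as soon as the footprint exceeds the coverage. A real microbenchmark looks for the corresponding jump in access latency while sweeping the array size with a page-sized stride. The quote does not give a page size for the L2 TLB, so no entry count is derived for it here.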

Upvotes: 2
