einpoklum

Reputation: 131546

What are the access times for different GPU memory spaces?

This is a question about discrete GPUs, mostly recent GPUs (NVIDIA Kepler, Maxwell; and whatever's in AMD Kaveri and R290's).

How long does it take to load an otherwise-uncached element into a register from...

A link to a table somewhere would be great, an explanation would be ok...

Upvotes: 5

Views: 2462

Answers (1)

SunsetQuest

Reputation: 8827

It varies by GPU, generation, how it is integrated (e.g. over PCIe), and other things. I work with ASM often, and these are the numbers I work with:

-Global device memory? Around 300-800 clocks. (Motherboard-mounted GPUs, as in laptops, that use main memory are slower.)

-Global memory L2 cache? Around 100 clock cycles.

-Texture cache(s)? My guess is 50-100 clock cycles.

-Constant cache(s)? Around 1-3 clock cycles on a cache hit. On a miss it falls back to the L2 cache (~50-100 clocks) or even global memory (300-500 clocks).

-Per-core (i.e. per-SMX/SMM in Kepler/Maxwell) L1 cache? Around 1-3 clock cycles.

-Per-core (i.e. per-SMX/SMM in Kepler/Maxwell) shared memory? Around 1-3 clock cycles.
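Since the question asks about access times and the figures above are in clock cycles, here is a rough sketch of the conversion to nanoseconds. The 1.0 GHz core clock is an assumed round figure for illustration only, not a spec of any particular GPU, and the names `cycles_to_ns` and `latency_table_ns` are made up for this example. The cycle ranges are simply the ones listed above.

```python
# Cycle ranges (low, high) as quoted in the list above, in core clock cycles.
LATENCY_CYCLES = {
    "global device memory": (300, 800),
    "L2 cache": (100, 100),
    "texture cache": (50, 100),
    "constant cache (hit)": (1, 3),
    "L1 cache": (1, 3),
    "shared memory": (1, 3),
}

# Assumption for illustration only; substitute your GPU's actual core clock.
ASSUMED_CLOCK_GHZ = 1.0


def cycles_to_ns(cycles, clock_ghz):
    """Convert a cycle count to nanoseconds (one cycle lasts 1/clock_ghz ns)."""
    return cycles / clock_ghz


def latency_table_ns(clock_ghz=ASSUMED_CLOCK_GHZ):
    """Return the same ranges expressed in nanoseconds at the given clock."""
    return {
        name: (cycles_to_ns(lo, clock_ghz), cycles_to_ns(hi, clock_ghz))
        for name, (lo, hi) in LATENCY_CYCLES.items()
    }


if __name__ == "__main__":
    for name, (lo, hi) in latency_table_ns().items():
        print(f"{name:24s} {lo:7.1f} - {hi:7.1f} ns")
```

At a higher clock the same cycle count is a shorter time, so it is worth plugging in the real clock of the card you care about before comparing numbers across GPUs.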

I also did some online searches to see how close I was and found this. The numbers are different than mine. http://lpgpu.org/wp/wp-content/uploads/2013/05/poster_andresch_acaces2014.pdf I think the actual time an access takes and the number the programmer should be working with are two different things, because with multithreading other threads can run while one is waiting. Hope this helps.
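To illustrate the point about multithreading, here is a back-of-the-envelope sketch. It is a deliberately simplified model and not how any real scheduler works: it assumes each warp alternates between a fixed number of busy cycles and one memory wait, and that the scheduler can always switch to a ready warp at no cost. The function names and the 400-cycle / 10-cycle figures are made up for the example (400 just sits inside the global memory range quoted above).

```python
import math


def warps_to_hide(latency_cycles, busy_cycles_per_warp):
    """Warps needed so that some warp is always ready to issue.

    Simplified model: each warp issues for busy_cycles_per_warp cycles,
    then waits latency_cycles for its load. While one warp waits, the
    others must supply latency_cycles worth of work between them.
    """
    return math.ceil(latency_cycles / busy_cycles_per_warp) + 1


def effective_cycles_per_access(latency_cycles, busy_cycles_per_warp, resident_warps):
    """Average cycles per memory access, seen across all resident warps.

    One warp completes an access every (busy + latency) cycles; with more
    warps the accesses overlap, until the issue slots are saturated.
    """
    period = latency_cycles + busy_cycles_per_warp
    return max(busy_cycles_per_warp, period / resident_warps)


if __name__ == "__main__":
    latency, busy = 400, 10  # assumed figures, for illustration only
    print("warps needed:", warps_to_hide(latency, busy))
    for warps in (1, 8, 41, 64):
        cost = effective_cycles_per_access(latency, busy, warps)
        print(f"{warps:3d} warps -> {cost:.1f} cycles per access")
```

Under these assumptions a single warp sees the full latency, while enough resident warps bring the average cost per access down to the issue cost. That is the gap between the raw latency and the number a programmer ends up working with.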

Upvotes: 4
