naicolas

Reputation: 148

Where is the texture cache located in a GPU?

I'm starting with OpenCL programming and learning about the differences between a texture buffer (also called an image) and a regular buffer. From what I understand, one of these differences is that texture fetches are cached, and that the cache exploits 2D locality.

The question is: where is this texture cache located? Is it shared across threads or is it just useful for accesses within a single thread?

For instance, consider this kernel:

__constant sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP_TO_EDGE | CLK_FILTER_NEAREST;

__kernel void myCoolKernel( __read_only image2d_t image, __global float * dst) {
        const int i = get_global_id(0);
        const int j = get_global_id(1);
        dst[i+j*get_global_size(0)] = read_imagef(image, sampler, (int2)(i, j)).x;
}

Will nearby threads (on the same work-group I guess?) take advantage of the image cache? Or is it more useful for a kernel like the one below?

__kernel void myCoolKernel( __read_only image2d_t image, __global float * dst) {
        // Pray it's not a boundary
        const int i = get_global_id(0);
        const int j = get_global_id(1);
        float pixel1 = read_imagef(image, sampler, (int2)(i+1, j)).x;
        float pixel2 = read_imagef(image, sampler, (int2)(i, j-1)).x;
        float pixel3 = read_imagef(image, sampler, (int2)(i-1, j)).x;
        float pixel4 = read_imagef(image, sampler, (int2)(i, j+1)).x;
        dst[i+j*get_global_size(0)] = pixel1+pixel2+pixel3+pixel4;
}

I hope I made myself clear. Thanks everyone.

Upvotes: 1

Views: 982

Answers (2)

Simon F

Reputation: 1055

It'll be GPU dependent. For example, some systems may have several "independent" shader "units", each of which will be running a subset of all available threads. Each shader unit will probably have its own L0 texture cache so that all threads assigned to that unit will be sharing that cache.

However, like a CPU, there will probably be a cache hierarchy such that there's an L1 cache that feeds the multiple shader unit L0s.

So, to answer your question, "Will nearby threads (on the same work-group I guess?) take advantage of the image cache?": yes. If the accesses of the threads, taken as a set, are coherent (that is, they touch texels that lie close together in the image), then they will take advantage of the cache hierarchy.

FWIW, there's a little more on texture cache usage on the computer graphics site.

Upvotes: 1

Dithermaster

Reputation: 6333

It is shared between threads. It exists because of graphics API textures (e.g., OpenGL, DirectX, Vulkan, etc.) but OpenCL images can use it too. It typically has 2D locality, probably due to Z-order storage. It is much faster than non-coalesced buffer access, but maybe not as fast as coalesced buffer access.
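To illustrate why Z-order storage would give 2D locality, here is a minimal sketch in plain C of a Z-order (Morton) index next to an ordinary row-major index. The function names `part1by1`, `morton2d` and `row_major` are made up for this example, and real GPU texture layouts are vendor-specific and usually tiled in more elaborate ways, so treat this as an illustration of the idea rather than a description of any particular hardware.

```c
#include <stdint.h>

/* Spread the low 16 bits of v so that a zero bit sits between each of them. */
uint32_t part1by1(uint32_t v) {
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Z-order (Morton) index: bits of x go to even positions, bits of y to odd ones. */
uint32_t morton2d(uint32_t x, uint32_t y) {
    return part1by1(x) | (part1by1(y) << 1);
}

/* Row-major index for comparison; width is the image width in texels. */
uint32_t row_major(uint32_t x, uint32_t y, uint32_t width) {
    return y * width + x;
}
```

With this layout, the 2x2 block of texels (0,0), (1,0), (0,1), (1,1) lands at consecutive indices, so it would plausibly fall in the same cache line. In a row-major image that is 1024 texels wide, texel (0,1) sits 1024 elements away from texel (0,0).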

Upvotes: 2
