Reputation: 544
I have searched other threads on using the L2 cache in CUDA but was unable to find a solution. How do I make use of the L2 cache? Is there a function to invoke or a declaration to add in order to use it? For example, to use shared memory we write __device__ __shared__. Is there anything like that for the L2 cache?
Upvotes: 3
Views: 3278
Reputation: 11549
The L2 cache is transparent to device code. All accesses to memory (global, local, surface, texture, constant, and instruction) that do not hit in L1 go to L2. All writes go through L2.
CUDA C Programming Guide F.4.2 : Global Memory
This section provides a few more details on L2.
The ptxas flag -dlcm=cg (passed through nvcc as -Xptxas -dlcm=cg) can be used to make global memory accesses uncached in L1 and cached only in L2.
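To illustrate the point that L2 needs no declaration, here is a minimal sketch of an ordinary global-memory kernel together with a query of the L2 size. The names scale, scaleOnDevice, and l2CacheBytes are hypothetical and used only for this example; cudaGetDeviceProperties and the l2CacheSize field come from the CUDA runtime API. Whether the -dlcm=cg setting changes anything depends on the GPU architecture, so treat the compile line in the comment as an assumption to verify on your hardware.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Possible compile line (file name is hypothetical):
//   nvcc -Xptxas -dlcm=cg l2_example.cu
// The source is the same with or without the flag; only the L1 policy differs.

// Ordinary global-memory kernel: nothing here mentions L2.
// Loads that miss in L1, and all stores, pass through L2 automatically.
__global__ void scale(float* data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Hypothetical helper: multiply n host floats by factor on the GPU, in place.
void scaleOnDevice(float* host, int n, float factor)
{
    float* dev = nullptr;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(dev, factor, n);

    // This copy waits for the kernel to finish before reading the result.
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);
}

// Hypothetical helper: L2 size in bytes for a device, or -1 on failure.
int l2CacheBytes(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return -1;
    return prop.l2CacheSize;
}
```

Calling scaleOnDevice on a small array and printing l2CacheBytes(0) from main is enough to try this out. The kernel is deliberately plain: the only L2-related information device code can act on is the cache size, not a way to place data in it.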
CUDA C Programming Guide B.5 : Memory Fence Functions
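The function __threadfence() can be used to make sure that the calling thread's earlier writes to global memory are visible in L2, and therefore to all other threads on the device.
The function __threadfence_system() can be used to make sure that those writes are also visible to host threads (for page-locked host memory) and to other devices.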
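As a sketch of where __threadfence() matters, the following follows the well-known "last block finishes the job" pattern, similar in spirit to the sum example in the Programming Guide's memory fence section. Each block writes a partial sum to global memory, fences, and then bumps a counter; the block that sees the final count reads every partial. The names blocksDone, sumPartials, and sumOnDevice are hypothetical, and the per-block sum is kept serial for brevity rather than speed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical counter of blocks that have published their partial sum.
__device__ unsigned int blocksDone = 0;

__global__ void sumPartials(const float* in, volatile float* partial,
                            float* total, int n)
{
    if (threadIdx.x != 0) return;  // one thread per block does the work here

    // Sum this block's slice of the input.
    int per   = (n + gridDim.x - 1) / gridDim.x;
    int begin = blockIdx.x * per;
    int end   = min(begin + per, n);
    float s = 0.0f;
    for (int i = begin; i < end; ++i) s += in[i];
    partial[blockIdx.x] = s;

    // Make the write to partial[] visible device-wide before the counter
    // update below can be observed by another block.
    __threadfence();

    // atomicInc returns the old value; the last block sees gridDim.x - 1.
    unsigned int ticket = atomicInc(&blocksDone, gridDim.x);
    if (ticket == gridDim.x - 1) {
        // volatile reads avoid picking up stale values from L1.
        float t = 0.0f;
        for (unsigned int b = 0; b < gridDim.x; ++b) t += partial[b];
        *total = t;
        blocksDone = 0;  // reset so the kernel can be launched again
    }
}

// Hypothetical host helper: returns the sum of n floats using `blocks` blocks.
float sumOnDevice(const float* hostIn, int n, int blocks)
{
    float *dIn = nullptr, *dPartial = nullptr, *dTotal = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dPartial, blocks * sizeof(float));
    cudaMalloc(&dTotal, sizeof(float));
    cudaMemcpy(dIn, hostIn, n * sizeof(float), cudaMemcpyHostToDevice);

    sumPartials<<<blocks, 32>>>(dIn, dPartial, dTotal, n);

    float total = 0.0f;
    cudaMemcpy(&total, dTotal, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dPartial);
    cudaFree(dTotal);
    return total;
}
```

Without the fence, the last block could in principle observe the incremented counter before another block's partial[] write has become visible, and read a stale value. Swapping in __threadfence_system() would only be needed if the host or another device had to observe the writes while the kernel is still running.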
Upvotes: 4