How can I read from pinned (page-locked) RAM, and not from the CPU cache (using DMA zero-copy with a GPU)?

Question

If I use DMA for RAM <-> GPU transfers in CUDA C++, how can I be sure that the memory will be read from the pinned (page-locked) RAM, and not from the CPU cache?

After all, with DMA the CPU knows nothing about the fact that another device changed the memory, or about the need to synchronize the CPU cache with RAM. As far as I know, a memory barrier from C++11 (such as std::atomic_thread_fence()) does not help with DMA: it will not force a read from RAM, but only ensures consistency between the L1/L2/L3 caches. Furthermore, as far as I know, there is no general protocol for resolving conflicts between the cache and RAM on the CPU, only protocols that synchronize the different levels of CPU cache (L1/L2/L3) and multiple CPUs in NUMA systems: MOESI / MESIF.

ArchaeaSoftware · Accepted Answer

On x86, the CPU does snoop bus traffic, so DMA transfers stay coherent with the CPU caches and this is not a concern. On Sandy Bridge-class CPUs, the PCI Express bus controller is integrated into the CPU, so the CPU can actually service GPU reads from its L3 cache, or update its cache based on writes by the GPU.
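To make this concrete, here is a minimal sketch of the zero-copy case, assuming an x86 host and a GPU that supports mapped pinned memory. The helper name `gpu_writes_visible_to_cpu`, the kernel `fill`, and the values written are illustrative and not from the question. The CPU touches the buffer first, so its lines are likely in the CPU cache. The GPU then overwrites it through the mapped pointer, and after synchronizing, the CPU reads it back with ordinary loads and no explicit cache flush.

```cuda
#include <cuda_runtime.h>

// Kernel that writes straight into mapped host memory (zero-copy over PCIe).
__global__ void fill(int *data, int n, int base) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = base + i;
}

// Hypothetical helper: returns true if the CPU sees every value the GPU wrote.
bool gpu_writes_visible_to_cpu(int n) {
    int *host = nullptr;
    // Pinned (page-locked) host memory, mapped into the GPU address space.
    // With the runtime API, mapping support is assumed to be enabled implicitly.
    if (cudaHostAlloc((void **)&host, n * sizeof(int), cudaHostAllocMapped) != cudaSuccess)
        return false;

    // Touch the buffer from the CPU so the lines are likely cached on the CPU side.
    for (int i = 0; i < n; ++i) host[i] = -1;

    // Device-side alias of the same physical pages.
    int *dev = nullptr;
    if (cudaHostGetDevicePointer((void **)&dev, host, 0) != cudaSuccess) {
        cudaFreeHost(host);
        return false;
    }

    fill<<<(n + 255) / 256, 256>>>(dev, n, 100);

    // Wait for the kernel; after this, the GPU writes are complete.
    bool ok = (cudaDeviceSynchronize() == cudaSuccess);

    // Plain CPU loads: no manual cache invalidation is performed here.
    for (int i = 0; ok && i < n; ++i) ok = (host[i] == 100 + i);

    cudaFreeHost(host);
    return ok;
}
```

Calling `gpu_writes_visible_to_cpu(1024)` from `main` should return true on such a system. What the host code does need is the synchronization call (or an event or stream wait) before reading, so that the kernel has actually finished. If the CPU instead polls the buffer while the kernel is still running, the host pointer would also need to be `volatile` so the compiler does not keep a stale value in a register.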
