Global memory details

Question

This is a follow up question to CUDA Global Memory, Where is it? In reference to GSmith's response. These Q's address the CC > 2.0 case.

When I lookup the spec's of my Nvida card, it lists 2GB of 'memory'. I've come to believe this is the 'Global' memory for this card. That is, this is GDDR3 memory that resides 'off-chip', but on the card. Would this be correct?

I don't see any case where the spec'd 'memory' is zero. Does one exist? That is, can I have a card w/ no off-chip memory? In that all my texture, local, and constant memory actually resides in pinned & mapped host memory.

Can I extend my global memory usage by pinning more than 2GB of host memory? Can I use all my off-chip global memory (2GB) and add (1GB) more global pinned memory? Or am I to understand that this card is only capable of an addressing space of 2GB max? i.e. I can only access 2GB of mem, unPinned, pinned, mapped, or any combo.

If the device is using pinned host memory (not mapped), do I need to Memcpy from dev to host? That is, the mem is physically on the host side. And it is being used by the device, so they can both see it. Why do I need to copy it to the host, when it is already there. It seems to be 'mapped' by default. (What mechanism is preventing this dual access?)

How does one go about mapping shared mem to global mem? (I'm not finding any mention of this in the doc's.) Is this a 'mapped' arrangement or do I still need to copy it from global to shared, and back again? (Could this save me a copy step?)

Robert Crovella · Accepted Answer

It's recommended to ask one question per question.

When I lookup the spec's of my Nvida card, it lists 2GB of 'memory'. I've come to believe this is the 'Global' memory for this card. That is, this is GDDR3 memory that resides 'off-chip', but on the card. Would this be correct?

Yes.

I don't see any case where the spec'd 'memory' is zero. Does one exist? That is, can I have a card w/ no off-chip memory? In that all my texture, local, and constant memory actually resides in pinned & mapped host memory.

The closest NVIDIA came to this idea was probably in the Ion 2 chipset. But there are no cuda-capable nvidia discrete graphics cards with zero on-board off-chip memory.

Can I extend my global memory usage by pinning more than 2GB of host memory?

You can pin more than 2GB of host memory. However this does not extend global memory. It does enable a variety of things such as improved host-device transfer rates, overlapped copy and compute, and zero-copy access of host memory from the GPU, but this is not the same as what you use global memory for. Zero-copy techniques perhaps come the closest to extending global memory onto host memory (conceptually) but zero copy is very slow from a GPU standpoint.

If the device is using pinned host memory (not mapped), do I need to Memcpy from dev to host?

Yes you still need to cudaMemcpy data back and forth.

That is, the mem is physically on the host side. And it is being used by the device

I don't know where this concept is coming from. Perhaps you are referring to zero-copy, but zero-copy is relatively slow compared to accessing data that is in global memory. It should be used judiciously in cases of small data sizes, and is by no means a straightforward way to provide a bulk increase to the effective size of the global memory on the card.

How does one go about mapping shared mem to global mem?

Shared memory is not automatically mapped to global memory. The methodology is to copy the data you need back and forth between shared and global memory.

Global memory details

Answers (1)

Related Questions