Reputation: 1428
I'm asking this because I know there's a way to use binary files instead of source files.
Also, I'm guessing that with an assembly language, it would be easier to simulate function pointers. Unless the assembly on a GPU is totally different from the one on a CPU.
Upvotes: 28
Views: 21289
Reputation: 247999
Yes, the assembly on a GPU is totally different from that of a CPU. One of the differences is that the instruction set for a GPU is not standardized. NVIDIA (and AMD and other GPU vendors) can and do change their instruction set from one GPU model to the next.
So CUDA does not expose an assembly language. There'd be no point. (And the limitations in CUDA's C dialect, and in whatever other languages it supports, are there because of limitations in the GPU hardware, not just because NVIDIA hates you and wants to annoy you. So even if you had direct access to the underlying instruction set and assembly language, you wouldn't be able to magically do things you can't do now.)
(Note that NVIDIA does define a "virtual" instruction set that you can use and embed in your code. But it's not the hardware instruction set, and it doesn't map directly to the hardware instructions. It's little more than a simpler programming language which "looks like" a dialect of assembly.)
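To make the "embed in your code" part concrete, here is a minimal sketch of inline PTX using CUDA's `asm()` statement. The names `ptx_add`, `add_kernel`, and `run_ptx_add` are made up for illustration, and the single `add.s32` instruction is just a stand-in for whatever you would actually want to hand-write.

```cuda
#include <cuda_runtime.h>

// Adds two integers with one hand-written PTX instruction.
// "=r" marks a 32-bit register output, "r" a 32-bit register input.
__device__ int ptx_add(int a, int b) {
    int result;
    asm("add.s32 %0, %1, %2;" : "=r"(result) : "r"(a), "r"(b));
    return result;
}

// Single-thread kernel that stores the sum in device memory.
__global__ void add_kernel(int a, int b, int *out) {
    *out = ptx_add(a, b);
}

// Hypothetical host helper: launches the kernel and copies the result back.
// Returns false if any CUDA call fails.
bool run_ptx_add(int a, int b, int *result) {
    int *d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(int)) != cudaSuccess) return false;
    add_kernel<<<1, 1>>>(a, b, d_out);
    cudaError_t err = cudaMemcpy(result, d_out, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

Calling `run_ptx_add(2, 3, &r)` from host code should leave the sum in `r`. The compiler still translates this PTX into the real hardware instructions for whatever GPU you target, so this is not direct access to the machine code.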
Upvotes: 18
Reputation: 3423
There are in fact two different CUDA assembly languages.
PTX is a machine-independent assembly language that is compiled down to SASS, the actual opcodes executed on a particular GPU family. If you build .cubins, you're dealing with SASS. Most CUDA runtime applications use PTX, since this enables them to run on GPUs released after the original application.
Also, function pointers have been in CUDA for a while if you're targeting sm_20 (Fermi/GTX 400 series) or later.
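A rough sketch of what device-side function pointers look like, assuming a target of sm_20 or newer. The names `binary_op`, `add_op`, `mul_op`, `apply_kernel`, and `run_apply` are invented for this example.

```cuda
#include <cuda_runtime.h>

// Pointer type for a device function taking two floats.
typedef float (*binary_op)(float, float);

__device__ float add_op(float a, float b) { return a + b; }
__device__ float mul_op(float a, float b) { return a * b; }

// Picks a device function at run time through a function pointer.
// The addresses are taken in device code; `which` must be 0 or 1.
__global__ void apply_kernel(int which, float a, float b, float *out) {
    binary_op ops[2] = { add_op, mul_op };
    *out = ops[which](a, b);
}

// Hypothetical host helper: launches the kernel and copies the result back.
// Returns false if any CUDA call fails.
bool run_apply(int which, float a, float b, float *result) {
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, sizeof(float)) != cudaSuccess) return false;
    apply_kernel<<<1, 1>>>(which, a, b, d_out);
    cudaError_t err = cudaMemcpy(result, d_out, sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess;
}
```

With `which` set to 0 the kernel adds its arguments, and with 1 it multiplies them. Taking the function addresses inside the kernel keeps the sketch short, since a `__device__` function's address is not directly usable from host code.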
Upvotes: 24
Reputation: 28302
You might want to take a look at PTX. NVIDIA provides a document describing it in the CUDA 4.0 documentation.
http://developer.nvidia.com/nvidia-gpu-computing-documentation
NVIDIA describes PTX as "a low-level parallel thread execution virtual machine and instruction set architecture (ISA). PTX exposes the GPU as a data-parallel computing device." Not exactly like x86 assembly, but you might find it interesting reading.
Upvotes: 37