Bob B

Reputation: 4614

Private cloud GPU virtualization similar to Amazon Web Services Cluster GPU instances

I am searching for options that enable dynamic cloud-based NVIDIA GPU virtualization similar to the way AWS assigns GPUs for Cluster GPU Instances.

My project is standing up an internal cloud. One requirement is the ability to allocate GPUs to virtual machines/instances for server-side CUDA processing.

USC appears to be working on OpenStack enhancements to support this, but they aren't ready yet. This would be exactly what I am looking for if it were fully functional in OpenStack.

NVIDIA VGX seems to support allocating GPUs only to USMs, which is strictly remote-desktop GPU virtualization. If I am wrong and VGX does enable server-side CUDA computing from virtual machines/instances, please let me know.

Upvotes: 15

Views: 6099

Answers (2)

GregoryN

Reputation: 1

There is a solution called GPUBox that virtualizes the devices within CUDA. It can be used either on Amazon or your own infrastructure.

Quote from the website (http://renegatt.com/solutions.php):

The GPUBox software simplifies GPU management by separating the application and operating systems from the underlying GPU devices. It is a solution that allows the dynamic sharing of GPU devices from the same pool, by many users. (...) GPUBox enables on-demand provisioning of GPU devices to a physical or virtual machine with a Linux or Windows operating system. The pool of GPU devices is shared among users which leads to reduction in the total power consumption and idle-running hardware.

Upvotes: 0

BraveNewCurrency

Reputation: 13065

"dynamic cloud-based NVIDIA GPU virtualization similar to the way AWS assigns GPUs for Cluster GPU Instances."

AWS does not really allocate GPUs dynamically: each Cluster GPU instance has 2 fixed GPUs. All other instance types (including the regular Cluster Compute) don't have any GPUs. That is, there is no API where you can say "GPU or not"; it's fixed by the instance type, which runs on fixed hardware.

The pass-through mode on Xen was made specifically for your use case: passing hardware through from the host to the guest. It's not 'dynamic' by default, but you could write some code that chooses which guest gets each card on the host.
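As a rough illustration of the "write some code" part, here is a minimal sketch of a per-host allocator that hands each free card to a requesting guest. The `GpuPool` class, the guest name, and the PCI addresses are all hypothetical. The sketch also assumes the host uses Xen's `xl` toolstack and that the cards have already been made assignable for passthrough; it only builds the `xl pci-attach` command line and does not run it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GpuPool:
    """Tracks which host GPU (by PCI address) is handed to which guest."""

    free: list[str]
    assigned: dict[str, str] = field(default_factory=dict)  # PCI address -> guest

    def assign(self, guest: str) -> str:
        """Give the next free card to `guest` and return its PCI address."""
        if not self.free:
            raise RuntimeError("no free GPU on this host")
        bdf = self.free.pop(0)
        self.assigned[bdf] = guest
        return bdf

    def release(self, bdf: str) -> None:
        """Return a card to the pool (raises KeyError if it was not assigned)."""
        del self.assigned[bdf]
        self.free.append(bdf)


def attach_command(guest: str, bdf: str) -> list[str]:
    # Assumption: the xl toolstack is in use and the device is already
    # assignable. The command is only built here, not executed.
    return ["xl", "pci-attach", guest, bdf]


# Hypothetical PCI addresses for two cards on one host.
pool = GpuPool(free=["0000:03:00.0", "0000:04:00.0"])
bdf = pool.assign("cuda-guest-1")
print(attach_command("cuda-guest-1", bdf))
```

A real scheduler would also need to track cards across hosts and handle guests that shut down without releasing their card; this only shows the bookkeeping for a single host.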

Upvotes: 4
