Reputation: 8098
We have been running DASK clusters on Kubernetes for some time. Up to now, we have been using CPUs for processing and, of course, system memory for storing our Dataframe of around 1,5 TB (per DASK cluster, split onto 960 workers). Now we want to update our algorithm to take advantage of GPUs. But it seems like the available memory on GPUs is not going to be enough for our needs, it will be a limiting factor(with our current setup, we are using more than 1GB of memory per virtual core).
I was wondering if it is possible to use GPUs (thinking about NVDIA, AMD cards with PCIe connections and their own VRAMS, not integrated GPUs that use system memory) for processing and system memory (not GPU memory/VRAM) for storing DASK Dataframes. I mean, is it technically possible? Have you ever tried something like this? Can I schedule a kubernetes pod such that it uses GPU cores and system memory together?
Another thing is, even if it was possible to allocate the system RAM as VRAM of GPU, is there a limitation to the size of this allocatable system RAM?
Note 1. I know that using system RAM with GPU (if it was possible) will create an unnecessary traffic through PCIe bus, and will result in a degraded performance, but I would still need to test this configuration with real data.
Note 2. GPUs are fast because they have many simple cores to perform simple tasks at the same time/in parallel. If an individual GPU core is not superior to an individual CPU core then may be I am chasing the wrong dream? I am already running dask workers on kubernetes which already have access to hundreds of CPU cores. In the end, having a huge number of workers with a part of my data won't mean better performance (increased shuffling). No use infinitely increasing the number of cores.
Note 3. We are mostly manipulating python objects and doing math calculations using calls to .so libraries implemented in C++.
Edit1: DASK-CUDA library seems to support spilling from GPU memory to host memory but spilling is not what I am after.
Edit2: It is very frustrating that most of the components needed to utilize GPUs on Kubernetes are still experimental/beta.
Upvotes: 2
Views: 381
Reputation: 794
I don't think this is possible directly as of today, but it's useful to mention why and reply to some of the points you've raised:
dask-cuda
is what comes to mind first when I think of your use-case. The docs do say it's experimental, but from what I gather, the team has plans to continue to support and improve it. :)dask-cuda
's spilling mechanism was designed that way for a reason -- while doing GPU compute, your biggest bottleneck is data-transfer (as you have also noted), so we want to keep as much data on GPU-memory as possible by design.I'd encourage you to open a topic on Dask's Discourse forum, where we can reach out to some NVIDIA developers who can help confirm. :)
A sidenote, there are some ongoing discussion around improving how Dask manages GPU resources. That's in its early stages, but we may see cool new features in the coming months!
Upvotes: 3