Reputation: 1
Issue:
Trying to load a file (CSV and Parquet) using Dask cuDF and seeing some memory-related errors. The dataset can easily fit into memory, and the file can be read correctly using BlazingSQL's read_parquet method. However, the dask_cudf.read_parquet() method fails on the same data. The same error appears with both file formats.
Another observation: when a BlazingSQL table is created from a cuDF dataframe, the table gets created but with zero records.
Any pointers on getting past this issue would be appreciated.
Dataset info:
No. of rows: 126 million
No. of columns: 209
File format: Parquet
No. of partitions: 8
File size (Parquet): 400 MB
File size (CSV): 62 GB
System info:
GPUs: 6 × Tesla V100, 16 GB memory each
CPU cores: 32
Client info:
Scheduler: tcp://127.0.0.1:36617
Dashboard: http://127.0.0.1:8787/status
Cluster workers: 4
Cores: 4
Memory: 239.89 GiB
Code:
from blazingsql import BlazingContext
from dask.distributed import Client, wait
from dask_cuda import LocalCUDACluster
import dask
import dask_cudf
cluster = LocalCUDACluster()
client = Client(cluster)
bc = BlazingContext(dask_client=client)
ddf = dask_cudf.read_parquet('/home/ubuntu/126M_dataset/')
bc.create_table('table', ddf.compute())
Error Message:
super(NumericalColumn, col).fillna(fill_value, method)
501
502 def find_first_value(
~/miniconda3/lib/python3.7/site-packages/cudf/core/column/column.py in fillna(self, value, method, dtype)
733 """
734 return libcudf.replace.replace_nulls(
--> 735 input_col=self, replacement=value, method=method, dtype=dtype
736 )
737
cudf/_lib/replace.pyx in cudf._lib.replace.replace_nulls()
cudf/_lib/scalar.pyx in cudf._lib.scalar.as_device_scalar()
~/miniconda3/lib/python3.7/site-packages/cudf/core/scalar.py in device_value(self)
75 if self._device_value is None:
76 self._device_value = DeviceScalar(
---> 77 self._host_value, self._host_dtype
78 )
79 return self._device_value
cudf/_lib/scalar.pyx in cudf._lib.scalar.DeviceScalar.__init__()
cudf/_lib/scalar.pyx in cudf._lib.scalar.DeviceScalar._set_value()
cudf/_lib/scalar.pyx in cudf._lib.scalar._set_numeric_from_np_scalar()
MemoryError: std::bad_alloc: CUDA error at: /home/ubuntu/miniconda3/include/rmm/mr/device/cuda_memory_resource.hpp:69: cudaErrorMemoryAllocation out of memory
nvidia-smi info:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.19.01 Driver Version: 465.19.01 CUDA Version: 11.3 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA Tesla V1... On | 00000000:00:1B.0 Off | 0 |
| N/A 49C P0 55W / 300W | 16147MiB / 16160MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA Tesla V1... On | 00000000:00:1C.0 Off | 0 |
| N/A 48C P0 56W / 300W | 16106MiB / 16160MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA Tesla V1... On | 00000000:00:1D.0 Off | 0 |
| N/A 46C P0 61W / 300W | 16106MiB / 16160MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 NVIDIA Tesla V1... On | 00000000:00:1E.0 Off | 0 |
| N/A 48C P0 60W / 300W | 16106MiB / 16160MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 113949 C ...ntu/miniconda3/bin/python 823MiB |
| 0 N/A N/A 114055 C ...ntu/miniconda3/bin/python 15319MiB |
| 1 N/A N/A 114059 C ...ntu/miniconda3/bin/python 16101MiB |
| 2 N/A N/A 114062 C ...ntu/miniconda3/bin/python 16101MiB |
| 3 N/A N/A 114053 C ...ntu/miniconda3/bin/python 16101MiB |
+-----------------------------------------------------------------------------+
Upvotes: 0
Views: 592
Reputation: 4214
"File size (Parquet) - 400 MB, file size (CSV) - 62 GB, GPU - 6 (Tesla V100), memory - 16 GB, GPU cores - 32"
When you call compute on a Dask collection, it fully computes the result and brings it into the client process as a single-GPU object. Your data is likely overwhelming the 16 GB of memory on one of your GPUs. You are likely looking for persist, which fully computes the result and stores it in memory on the workers (note that the execution will happen in the background and persist will return quickly).
Additionally, you shouldn't need to persist your data before creating a BlazingSQL table from a Dask object; you can pass the dask_cudf DataFrame to create_table directly.
You may find this answer, this blog post, and this documentation useful.
Upvotes: 1