Reputation: 6080
I am trying to read a huge CSV file with cuDF but I get memory errors.
import cudf

cudf.set_allocator("managed")
cudf.__version__  # '0.17.0a+382.gbd321d1e93'

user_wine_rate_df = cudf.read_csv('myfile.csv',
                                  sep='\t',
                                  parse_dates=['created_at'])
terminate called after throwing an instance of 'thrust::system::system_error'
what(): parallel_for failed: cudaErrorIllegalAddress: an illegal memory access was encountered
Aborted (core dumped)
If I remove cudf.set_allocator("managed"), I get:
MemoryError: std::bad_alloc: CUDA error at: /opt/conda/envs/rapids/include/rmm/mr/device/cuda_memory_resource.hpp:69: cudaErrorMemoryAllocation out of memory
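For reference, the same managed-memory setup can be requested through RMM directly. A minimal sketch, assuming the rmm package that ships with RAPIDS; reinitialize() has to run before the first GPU allocation, and the pool size here is only an illustrative value:

import rmm
import cudf

# Sketch (assumption, not from the original post): ask RMM for a
# managed-memory pool up front. reinitialize() must be called before
# any GPU allocation has happened in the process.
rmm.reinitialize(managed_memory=True,
                 pool_allocator=True,
                 initial_pool_size=2 << 30)  # 2 GiB, illustrative

user_wine_rate_df = cudf.read_csv('myfile.csv',
                                  sep='\t',
                                  parse_dates=['created_at'])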
I am running cuDF inside the rapidsai/rapidsai:cuda11.0-runtime-ubuntu16.04-py3.8 Docker image.
I wonder what could be causing these memory errors, given that I can read this big file with pandas.
**Update**

I installed dask_cudf and used dask_cudf.read_csv('myfile.csv'), but I still get:
parallel_for failed: cudaErrorIllegalAddress: an illegal memory access was encountered
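For context, the usual dask_cudf pattern pairs the read with a dask_cuda GPU cluster and an explicit partition size, so no single partition has to hold the whole file. A minimal sketch, where the memory limit and chunk size are illustrative guesses:

from dask.distributed import Client
from dask_cuda import LocalCUDACluster
import dask_cudf

# Sketch only: a single-GPU cluster. device_memory_limit lets Dask spill
# partitions from GPU to host memory; the 10GB figure is illustrative.
cluster = LocalCUDACluster(device_memory_limit='10GB')
client = Client(cluster)

# Smaller partitions keep each chunk within GPU memory. The keyword is
# chunksize in cuDF 0.17; newer releases call it blocksize.
ddf = dask_cudf.read_csv('myfile.csv',
                         sep='\t',
                         parse_dates=['created_at'],
                         chunksize='256 MiB')

print(ddf.head())  # work runs lazily, one partition at a time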
Upvotes: 1
Views: 3377
Reputation: 1291
pandas parses on the CPU into host RAM, while cudf.read_csv has to fit the parsed data (plus working space) into the much smaller GPU memory, which is why pandas can handle a file that cuDF cannot. Check out this blog post by Nick Becker on reading files larger than GPU memory. It should get you on your way.
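If you want to stay in plain cuDF, read_csv also takes a byte_range=(offset, size) argument, so the file can be processed in slices that each fit in GPU memory. A rough sketch along those lines; the chunk size is a placeholder, and note that slices after the first do not contain the header row, so the column names are carried over by hand:

import os
import cudf

path = 'myfile.csv'
chunk = 256 * 1024 * 1024          # 256 MiB per slice, illustrative
size = os.path.getsize(path)

names = None
for offset in range(0, size, chunk):
    if offset == 0:
        part = cudf.read_csv(path, sep='\t',
                             byte_range=(offset, chunk),
                             parse_dates=['created_at'])
        names = list(part.columns)  # remember the header for later slices
    else:
        part = cudf.read_csv(path, sep='\t',
                             byte_range=(offset, chunk),
                             header=None, names=names,
                             parse_dates=['created_at'])
    # Process each slice here (filter, aggregate, write out) and let it
    # go out of scope before the next one is read.
    print(offset, len(part))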
Upvotes: 1