Reputation: 697
I'm trying to use Modin on Databricks, but reading a file fails with the error shown below.
I've tried both pip install modin[all]
and pip install modin[ray].
First, the installation takes 15 minutes, which seems unusually long.
After installing, I run:
import modin.pandas as md
df = md.read_parquet('s3://path/to/file')
This raises the following error:
ModuleNotFoundError: No module named 'ray'
I have also tried setting os.environ["MODIN_ENGINE"] = "ray"
Upvotes: 0
Views: 6333
Reputation: 183
I followed the steps below to install Modin with the Ray execution engine. First, install Modin together with its Ray dependencies:
pip install modin[ray]
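After the install finishes, it may be worth confirming that ray is actually visible to the interpreter the notebook is running in, since a ModuleNotFoundError after a seemingly successful install can mean the package went into a different environment. The following is a minimal sketch using only the standard library; the helper name check_modules is hypothetical and not part of Modin or Ray.

```python
import importlib.util
import sys


def check_modules(names=("modin", "ray")):
    """Return a mapping of module name -> True if it can be found for import.

    check_modules is a hypothetical helper for illustration only.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Show which Python interpreter is running, then which packages it can see.
    print("Interpreter:", sys.executable)
    for name, found in check_modules().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If ray is reported as missing here, one possible cause on Databricks is that pip was run outside the notebook's Python environment; installing from a notebook cell with %pip install modin[ray] is one way to target the notebook's own environment.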
Next, initialize Ray yourself before importing Modin, so that Modin uses your Ray environment:
import ray
ray.init()
import modin.pandas as pd
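If Ray may or may not be available on a given cluster, the engine can also be chosen through the MODIN_ENGINE environment variable, which has to be set before modin.pandas is imported. The sketch below picks the first engine whose backing package is installed. The helper select_engine and the ENGINE_MODULES mapping are assumptions made for illustration, not part of Modin's API.

```python
import importlib.util
import os

# Engine name -> package that must be importable for that engine.
# This mapping is an assumption for illustration, not taken from Modin itself.
ENGINE_MODULES = {"ray": "ray", "dask": "distributed"}


def select_engine(engine_modules=ENGINE_MODULES):
    """Set MODIN_ENGINE to the first engine whose package is installed.

    Returns the chosen engine name, or None if no listed package was found
    (in which case MODIN_ENGINE is left untouched).
    """
    for engine, module in engine_modules.items():
        if importlib.util.find_spec(module) is not None:
            os.environ["MODIN_ENGINE"] = engine
            return engine
    return None
```

Call select_engine() first and import modin.pandas only afterwards; if it returns None, none of the listed engine packages is installed in the current environment.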
For installation issues, and for accelerating pandas workflows on Intel architectures, see the Intel Distribution of Modin (https://www.intel.com/content/www/us/en/developer/tools/oneapi/distribution-of-modin.html#gs.14j7r0) and the official Modin documentation (https://modin.readthedocs.io/en/stable/).
Upvotes: 1