Reputation: 21
I am running this import statement:
from pyspark.ml.feature import VectorAssembler
And this is the full traceback:
ModuleNotFoundError Traceback (most recent call last)
Cell In[5], line 1
----> 1 from pyspark.ml.feature import VectorAssembler
File /Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/pyspark/ml/__init__.py:22
1 #
2 # Licensed to the Apache Software Foundation (ASF) under one or more
3 # contributor license agreements. See the NOTICE file distributed with
(...)
15 # limitations under the License.
16 #
18 """
19 DataFrame-based machine learning APIs to let users quickly assemble and configure practical
20 machine learning pipelines.
21 """
Upvotes: 0
Views: 39
Reputation: 21
How to add MLlib library to Spark?
This solved my issue:
Try running pip install numpy (or pip3 install numpy if that fails). The traceback says the numpy module was not found.
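Before installing anything, it can help to confirm which module the current interpreter actually cannot find. The following is a minimal sketch, not part of the original fix: the helper name missing_modules is hypothetical, and the list of names checked is only an assumption based on the modules mentioned in this thread.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# numpy is the module named above; pyspark is included for comparison.
print(missing_modules(["numpy", "pyspark"]))
```

Run it in the same notebook kernel that raised the error. Any name it prints is not importable from that kernel, so that is the package to install.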
Upvotes: 0
Reputation: 46
The traceback you posted suggests pyspark isn't set up correctly in your Python environment.
First, if you've had success with pyspark in jupyter lab before, make sure you're using that same kernel.
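One way to check which kernel you are on is to print the interpreter behind it from inside the notebook. This is only a sketch using the standard library; the helper name interpreter_info is hypothetical.

```python
import sys


def interpreter_info():
    """Collect the interpreter path, version, and import search path of the running kernel."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "path": list(sys.path),
    }


info = interpreter_info()
print(info["executable"])  # the Python binary behind this kernel
print(info["version"])
for entry in info["path"]:  # directories searched when importing
    print(entry)
```

If the executable printed here is not the Python you installed pyspark into, the notebook is running on a different kernel.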
If that wasn't it, you can rule out a missing or corrupted pyspark installation with:
pip uninstall pyspark
pip install pyspark
If you're sure you have pyspark installed, consider installing and using findspark (pip install findspark), which automatically finds and adds pyspark to your sys.path at runtime.
import findspark
findspark.init()
from pyspark.ml.feature import VectorAssembler
If all else fails, you can make a new Python environment with pyspark in it. This process isn't unique to JupyterLab; any Jupyter notebook tutorial will point you in the right direction. I'll assume you're on OS X based on the /Library/Frameworks line in your traceback:
python -m venv MY_ENVIRONMENT_NAME
source MY_ENVIRONMENT_NAME/bin/activate
pip install pyspark
pip install ipykernel
python -m ipykernel install --user --name MY_KERNEL_NAME
Replace MY_ENVIRONMENT_NAME and MY_KERNEL_NAME as you like. After this, the new environment will show in your list of kernels. Select it and you're good to go.
Upvotes: 0