Matjaž

Reputation: 2115

Apache Spark - ModuleNotFoundError: No module named 'mysql'

I'm trying to submit an Apache Spark driver program to a remote cluster, and I'm having difficulties with the Python package mysql. I installed this package on all Spark nodes. The cluster runs inside docker-compose, and the images are based on bde2020.

$ docker-compose logs  impressions-agg
impressions-agg_1  | Submit application /app/app.py to Spark master spark://spark-master:7077
impressions-agg_1  | Passing arguments 
impressions-agg_1  | 19/11/13 18:45:20 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
impressions-agg_1  | Traceback (most recent call last):
impressions-agg_1  |   File "/app/app.py", line 6, in <module>
impressions-agg_1  |     from mysql.connector import connect
impressions-agg_1  | ModuleNotFoundError: No module named 'mysql'
impressions-agg_1  | log4j:WARN No appenders could be found for logger (org.apache.spark.util.ShutdownHookManager).
impressions-agg_1  | log4j:WARN Please initialize the log4j system properly.
impressions-agg_1  | log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

Module mysql is installed via pip on all nodes.

$ docker-compose exec spark-master pip list
Package         Version            
--------------- -------------------
mysql-connector 2.2.9              
pip             18.1               
setuptools      40.8.0.post20190503

$ docker-compose exec spark-worker pip list
Package         Version            
--------------- -------------------
mysql-connector 2.2.9              
pip             18.1               
setuptools      40.8.0.post20190503

How can I solve this? Thank you for any information.

Upvotes: 0

Views: 706

Answers (1)

tobygriffin

Reputation: 5421

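While the Spark nodes (spark-master and spark-worker) have mysql-connector installed, the impressions-agg container does not. What the logs are telling you is that impressions-agg_1 contains a script at /app/app.py which tries to import mysql.connector but cannot find it in that container's Python environment.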

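Did you build the impressions-agg image yourself? If so, add a RUN pip install mysql-connector step to its Dockerfile. That is the same package your pip list output shows on the master and worker, and it is the one that provides the mysql.connector module; the PyPI package named plain mysql is a different package.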
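To confirm which container is missing the package, a small check like the one below might help. It is only a sketch: the helper name module_available is made up for illustration, and it assumes you run it with the same Python interpreter that the submit step uses inside the impressions-agg image.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if `name` can be located by this interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # find_spec imports parent packages first, so a missing parent
        # (e.g. "mysql" when asking for "mysql.connector") raises here.
        return False


if __name__ == "__main__":
    # Printing the interpreter path shows which Python (and therefore
    # which site-packages) is actually being checked.
    print("interpreter:", sys.executable)
    print("mysql.connector importable:", module_available("mysql.connector"))
```

If this reports that the module is not importable inside impressions-agg but it is importable on spark-master and spark-worker, the package is missing only from the driver image.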

Upvotes: 1
