Rohit

Reputation: 632

Not able to submit Python application using spark-submit

I generated an .egg file. Now I want to run my Spark application with the spark-submit command on my local Windows machine. I have Spark version 2.1.1.

spark-submit --py-files local:///C:/git_local/sparkETL/dist/sparkETL-0.1-py3.6.egg driver.py

This is the command I'm trying, but I'm getting an error:

File not found(c:\spark\bin\driver.py)

Why is spark-submit trying to find the file on a local path when I have already packaged it inside the .egg? I read that .egg files are similar to jars, so I assumed it works like a jar, where we pass the main class name to spark-submit. I'm passing driver.py, which is the main file, but it is not working.

Upvotes: 1

Views: 2206

Answers (1)

Duy Nguyen

Reputation: 1015

For a PySpark application, spark-submit always requires a Python file to run (here, driver.py). The --py-files entries are only libraries that you attach to your Spark job, and they are typically imported from inside driver.py.

If you want to make it work, make sure driver.py exists in the directory from which you run spark-submit. Or change the argument to something like local:///C:/git_local/sparkETL/driver.py
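To illustrate why the .egg alone is not enough, here is a rough sketch in plain Python (no Spark needed). An .egg is a zip archive, and --py-files essentially puts such archives on the Python import path so the driver script can import from them. The archive name demo_lib.zip, the module etl_helpers, and the function transform are made up for this example; they are not from your project.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a throwaway archive standing in for the .egg (an egg is a zip archive).
# "demo_lib.zip", "etl_helpers" and "transform" are hypothetical names.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "demo_lib.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("etl_helpers.py", "def transform(x):\n    return x * 2\n")

# Roughly what --py-files arranges: the archive ends up on the import path.
sys.path.insert(0, archive)
importlib.invalidate_caches()

# The driver script is still a separate file on disk; it only imports
# library code from the archive.
etl_helpers = importlib.import_module("etl_helpers")
result = etl_helpers.transform(21)
print(result)  # 42
```

The archive only supplies importable modules. The script that performs the import (driver.py in your case) still has to be a real file that spark-submit can locate, which is why the path to it matters.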

Upvotes: 1
