Reputation: 141
I placed two scripts, pySparkScript.py and relatedModule.py, under the /user/usr1/ path in HDFS. relatedModule.py is a Python module that is imported by pySparkScript.py.
I can run the scripts with spark-submit pySparkScript.py.
However, I need to run these scripts through Livy. Normally, I run a single script successfully as follows:
curl -H "Content-Type:application/json" -X POST -d '{"file": "/user/usr1/pySparkScript.py"}' livyNodeAddress/batches
However, when I run the above command, it fails as soon as it reaches the import relatedModule statement. I realize I should also pass the path of relatedModule.py in the Livy request parameters, so I tried the following option:
curl -H "Content-Type:application/json" -X POST -d '{"file": "/user/usr1/pySparkScript.py", "files": ["/user/usr1/relatedModule.py"]}' livyNodeAddress/batches
How should I pass both files to Livy?
Upvotes: 0
Views: 885
Reputation: 991
Try using the pyFiles property instead of files.
Please refer to the Livy REST API docs.
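For example, with the paths from your question, the request would look something like this (a sketch; check the Livy REST API docs for the full list of batch-session fields):
curl -H "Content-Type: application/json" -X POST -d '{"file": "/user/usr1/pySparkScript.py", "pyFiles": ["/user/usr1/relatedModule.py"]}' livyNodeAddress/batches
pyFiles places the listed files on the Python path of the Spark job (like spark-submit's --py-files), so import relatedModule should then resolve.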
Upvotes: 1