sachin

Reputation: 1360

How to package separate dependencies for driver and executor in pyspark?

I am looking at various approaches for PySpark package management. I went through https://spark.apache.org/docs/latest/api/python/user_guide/python_packaging.html . As I understand it, with all of these methods the zip file is downloaded to both the driver and the executors. Is it possible to specify that certain packages go only to the driver and not to the executors? Or is my understanding wrong?

My use case is that I need some packages on the driver side only. These packages can be fairly large, and they are never used on the executors. I did not see anything in PySpark like the separate driver/executor classpath options that exist on the Java side. Can you recommend some best practices for PySpark dependency management?
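To make the distinction concrete, here is a minimal sketch of the pattern I have been considering (all names are hypothetical, and `json` stands in for a real executor-side dependency): top-level imports are only evaluated where the script itself runs, i.e. on the driver, while imports inside a function body are evaluated wherever that function is called, e.g. on the executors when passed to `mapPartitions`.

```python
# Top-level imports run on the driver when the script starts.
# A large driver-only package (e.g. a hypothetical reporting library)
# would be imported here and never shipped to executors.

def parse_partition(rows):
    # Imports inside the function body are deferred until the function
    # is called -- on the executors, if this is used with mapPartitions.
    # The package therefore has to be installed/shipped there.
    import json  # stdlib placeholder for an executor-side dependency
    return [json.loads(r) for r in rows]

print(parse_partition(['{"a": 1}', '{"b": 2}']))
```

But this only controls *when* the import happens, not *where* the archive from `--py-files` / `spark.submit.pyFiles` gets distributed, which is what my question is about.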

Thank you.

Upvotes: 0

Views: 54

Answers (0)

Related Questions