ChingamChaapu

Reputation: 33

Azure Synapse: Upload directory of py files in Spark job reference files

I am trying to pass a whole directory of Python files that are referenced in the main Python file of an Azure Synapse Spark job definition, but the files do not appear in the expected location and I get a ModuleNotFoundError. I am trying to upload them like this:

abfss://[directory path in data lake]/*
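For illustration, the main file imports the referenced files like this (helpers is a placeholder name for one of the uploaded .py files, not the actual module):

# main.py -- the job's main definition file
# "helpers" stands in for a .py file passed as a reference file.
import helpers  # raises ModuleNotFoundError; the wildcard upload is not picked up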

Upvotes: 1

Views: 1752

Answers (2)

Skand Upmanyu

Reputation: 31

You have to trick the Spark job definition by exporting it, editing the JSON, and importing it back.

After the export, open the file in a text editor and add the following:

"conf": {
  "spark.submit.pyFiles": 
    "path-to-abfss/module1.zip, path-to-abfss/module2.zip"
},

Now, import the JSON back.
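Note that spark.submit.pyFiles expects archives (or individual .py files), so the package directory has to be zipped first. A minimal sketch of that step, assuming your package lives in a local folder named module1 (a placeholder name):

import zipfile
from pathlib import Path

# Zip the package directory so "import module1" works on the executors.
# "module1" is a placeholder; substitute your own package name.
src = Path("module1")
with zipfile.ZipFile("module1.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    for py in src.rglob("*.py"):
        # Keep the package folder as the top level inside the archive.
        zf.write(py, py.relative_to(src.parent))

Upload the resulting zip files to your data lake and reference their abfss:// paths in the conf above.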

Upvotes: 3

Simon Ndunda

Reputation: 994

The way to achieve this on Synapse is to package your Python files into a wheel and upload the wheel package to a specific location in Azure Data Lake Storage, from which your Spark pool will load it every time it starts. This makes the custom Python packages available to all jobs and notebooks using that Spark pool.
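As a rough sketch of the packaging step (mymodule is a placeholder for your package directory), a minimal setup.py could look like this; build the wheel with pip wheel . :

# setup.py -- minimal packaging sketch; "mymodule" is a placeholder name
from setuptools import setup, find_packages

setup(
    name="mymodule",
    version="0.1.0",
    packages=find_packages(),  # discovers mymodule/ and any subpackages
)

The build produces a .whl file, which is what you upload for the Spark pool to install at startup.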

You can find more details in the official documentation: https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/apache-spark-manage-python-packages#install-wheel-files

Upvotes: 1
