Reputation: 147
I want to use a Maven package in a Databricks job, which is supposed to run on a new automated cluster. Regular interactive clusters have the option to install a Maven package, and this installation resolves all of the package's dependencies. On an automated cluster you can only assign downloaded JARs to be installed on cluster startup.
My problem is that the dependencies of this JAR are missing. Of course I could download them and add them to the cluster as well, but the dependency tree seems to be pretty large. Can I just download a JAR with all dependencies included (I did not find one)? Or can I install the package in another way? A rough sketch of what I mean by assigning JARs to the job cluster is shown below.
The package I need is azure-eventhubs-spark.
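For illustration, this is roughly what the jar-only library attachment on a job cluster looks like when expressed through the Databricks Jobs API (the DBFS paths and the version number are placeholders, not the actual files I have):

```python
# Minimal sketch of attaching downloaded JARs to a job cluster via the
# "libraries" field of a Jobs API job spec. Every transitive dependency
# would need its own "jar" entry here, which is the problem.
job_libraries = [
    {"jar": "dbfs:/FileStore/jars/azure-eventhubs-spark_2.11-2.3.2.jar"},
    # {"jar": "dbfs:/FileStore/jars/azure-eventhubs-2.3.2.jar"},  # ...and so on for each dependency
]
```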
Upvotes: 3
Views: 4934
Reputation: 147
Finally found a solution.
To attach a Maven package to a job (cluster), you have to create the library in your workspace first. On the start page of the Databricks UI, choose 'Import Library', then create the Maven package you'd like. This library can then be added as a dependency in the job settings; a sketch of the equivalent API call follows below.
Was kind of an obvious solution, but I had never created a library in Databricks before and therefore wasn't aware of this option.
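If you prefer to configure the job programmatically, the Jobs API accepts a Maven coordinate directly in the `libraries` field, and Databricks then resolves the transitive dependencies when the job cluster starts. A minimal sketch, assuming a personal access token and placeholder workspace URL, notebook path, runtime, and package version (check Maven Central for the coordinate matching your Scala/Spark version):

```python
import requests

# Placeholders: replace with your workspace URL and a personal access token.
DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"
TOKEN = "<personal-access-token>"

job_spec = {
    "name": "eventhubs-streaming-job",  # hypothetical job name
    "new_cluster": {
        "spark_version": "5.5.x-scala2.11",  # assumption: runtime matching the _2.11 artifact
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 2,
    },
    # Maven coordinates instead of a single JAR: transitive dependencies
    # are resolved and installed on cluster startup.
    "libraries": [
        {"maven": {"coordinates": "com.microsoft.azure:azure-eventhubs-spark_2.11:2.3.2"}}
    ],
    "notebook_task": {"notebook_path": "/Users/me@example.com/eventhubs-demo"},  # placeholder
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
resp.raise_for_status()
print(resp.json())  # e.g. {"job_id": 123}
```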
Upvotes: 4