Reputation: 43
I've used com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.0.0 installed on a cluster with runtime 8.3.x-scala2.12 for a long time. It suddenly stopped working, and Databricks jobs run on clusters with this library are now cancelled.
The cluster driver's stderr log contains the following error: ANTLR Tool version 4.7 used for code generation does not match the current runtime version 4.8
I've tried updating the library and cluster runtime versions, and also installing the JAR directly instead of via Maven, but it didn't help.
My cluster now has the following configuration:
{
  "autoscale": {
    "min_workers": 1,
    "max_workers": 2
  },
  "cluster_name": "test-clstr002",
  "spark_version": "9.1.x-scala2.12",
  "spark_conf": {
    "spark.databricks.delta.preview.enabled": "true"
  },
  "azure_attributes": {
    "first_on_demand": 1,
    "availability": "ON_DEMAND_AZURE",
    "spot_bid_max_price": -1
  },
  "node_type_id": "Standard_F4s",
  "driver_node_type_id": "Standard_F4s",
  "ssh_public_keys": [],
  "custom_tags": {},
  "spark_env_vars": {},
  "autotermination_minutes": 60,
  "enable_elastic_disk": true,
  "cluster_source": "API",
  "init_scripts": []
}
[Screenshot: the azure-cosmos-spark Maven library installed on the cluster]
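For reference, that same installation can be expressed as a Maven library spec for the Databricks Libraries API (POST /api/2.0/libraries/install); the cluster_id value is a placeholder:

{
  "cluster_id": "<your-cluster-id>",
  "libraries": [
    {
      "maven": {
        "coordinates": "com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.0.0"
      }
    }
  ]
}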
Thank you for any help or suggestions!
Upvotes: 1
Views: 294
Reputation: 2764
When you install a conflicting version of a library, the cluster cancels the jobs that depend on it. If you add the Cosmos DB Spark connector by its Maven coordinates, Databricks also pulls in its transitive dependencies, and those can conflict with libraries pre-installed on the cluster (here, the ANTLR runtime), which is what produces the error you are seeing.
Solution: instead of the Maven coordinates, install the connector as a single uber JAR (which bundles shaded copies of its dependencies) uploaded directly to the cluster, as described in the reference below.
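As a sketch, installing an uploaded uber JAR through the same Databricks Libraries API looks like this; the DBFS path is a placeholder for wherever you upload the JAR:

{
  "cluster_id": "<your-cluster-id>",
  "libraries": [
    {
      "jar": "dbfs:/FileStore/jars/<connector-uber>.jar"
    }
  ]
}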
Reference:
https://learn.microsoft.com/en-us/azure/databricks/data/data-sources/azure/cosmosdb-connector
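Once the library installs cleanly, a minimal read in a notebook can confirm the connector loads; the endpoint, key, database, and container values below are placeholders:

# Minimal smoke test for the Cosmos DB Spark connector (v4 "cosmos.oltp" format).
# All connection values are placeholders - replace them with your account's settings.
# `spark` is the SparkSession that Databricks predefines in notebooks.
cfg = {
    "spark.cosmos.accountEndpoint": "https://<account>.documents.azure.com:443/",
    "spark.cosmos.accountKey": "<account-key>",
    "spark.cosmos.database": "<database>",
    "spark.cosmos.container": "<container>",
}

# If the dependency conflict is resolved, this load succeeds instead of failing
# with the ANTLR version-mismatch error during query planning.
df = spark.read.format("cosmos.oltp").options(**cfg).load()
df.show(5)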
Upvotes: 0