Reputation: 672
I have noticed there is an unanswered question about getting a weird response from the Azure Databricks REST API 2.0 while trying to create a cluster:
'error_code': 'INVALID_PARAMETER_VALUE', 'message': 'Missing required field: size'
Has anyone solved this issue? Is there a new API, or is this a bug in it?
I actually used an example from the Microsoft Databricks documentation and had to change several things, but I used
"autoscale": {
"min_workers": 2,
"max_workers": 8
}
I thought this weird error might be related to it, so I set num_workers instead, but that led to the same issue.
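For reference, the fixed-size variant I tried looks roughly like this (a sketch, the value 2 is just an example), used in place of the autoscale block:
"num_workers": 2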
Other changes I had to make:
headers={"Authorization": "Bearer %s" % (TOKEN)}
otherwise there was an error about the header. I also read the error from response.content instead of trying to read it from the JSON via response.json()["error_code"], which caused an error (I didn't go deep into it, I just needed the message about what went wrong).
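For completeness, this is roughly how I inspect a failed call (a minimal sketch; response is the requests.Response returned by the requests.post call shown in the answer below):
# Print the raw error body instead of assuming it parses cleanly as JSON
if response.status_code != 200:
    print(response.status_code)
    print(response.content)  # e.g. the "Missing required field: size" message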
Upvotes: 3
Views: 6528
Reputation: 672
OK. I don't know if it is a valid approach, but this actually creates clusters programmatically on my Azure Databricks resource. Along with the previous changes I mentioned above, I additionally just removed the "new_cluster" wrapper (see the example here) and used:
import requests

# DOMAIN is the workspace host, TOKEN is a personal access token and name is the
# desired cluster name; all three are defined elsewhere.
response = requests.post(
    'https://%s/api/2.0/clusters/create' % (DOMAIN),
    headers={"Authorization": "Bearer %s" % (TOKEN)},
    json={
        "cluster_name": name.lower().strip(),
        "spark_version": "6.2.x-scala2.11",
        "node_type_id": "Standard_D3_v2",
        "spark_env_vars": {
            "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
        },
        "spark_conf": {
            "spark.databricks.cluster.profile": "serverless",
            "spark.databricks.repl.allowedLanguages": "sql,python,r"
        },
        "autoscale": {
            "min_workers": 2,
            "max_workers": 8
        },
        "ssh_public_keys": [],
        "autotermination_minutes": 50
    }
)
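If it helps, this is roughly how I check the result afterwards (a sketch; as far as I can tell the create endpoint returns a JSON body containing cluster_id on success):
if response.status_code == 200:
    print(response.json()["cluster_id"])  # id of the newly created cluster
else:
    print(response.content)  # raw error body, as mentioned above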
Upvotes: 2