Balaji

Reputation: 111

Airflow DataprocClusterCreateOperator

In Airflow DataprocClusterCreateOperator settings:

Is there a way to set the primary disk type for the master and worker nodes to pd-ssd?

The default setting is the standard persistent disk (pd-standard).

I looked through the documentation but could not find any parameter for this.

Upvotes: 0

Views: 607

Answers (2)

kaxil

Reputation: 18824

Unfortunately, there is no option to change the Disk Type in DataprocClusterCreateOperator.

The Google Dataproc API does support it via the DiskConfig object: https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.clusters#diskconfig
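
For reference, the disk type sits in the DiskConfig object inside the master and worker instance group configs of the cluster payload. A rough illustration (field names come from the REST reference linked above; the machine types and sizes are just examples):

cluster_config = {
    "masterConfig": {
        "machineTypeUri": "n1-standard-4",
        "diskConfig": {
            "bootDiskType": "pd-ssd",     # "pd-standard" (default) or "pd-ssd"
            "bootDiskSizeGb": 500,
        },
    },
    "workerConfig": {
        "machineTypeUri": "n1-standard-4",
        "diskConfig": {
            "bootDiskType": "pd-ssd",
            "bootDiskSizeGb": 500,
        },
    },
}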

I will try to add this feature; it should then be available in Airflow 1.10.1 or Airflow 2.0.

For now, you can create an Airflow plugin that modifies the current DataprocClusterCreateOperator.
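
A minimal sketch of such a plugin, assuming the contrib operator in your Airflow version assembles its request in an internal _build_cluster_data() method and mirrors the REST payload layout (both are assumptions here; check the operator source for your version):

from airflow.contrib.operators.dataproc_operator import DataprocClusterCreateOperator


class DataprocClusterCreateSsdOperator(DataprocClusterCreateOperator):
    """Hypothetical subclass that forces pd-ssd boot disks.

    It patches the cluster payload built by the parent operator
    before the create request is sent to the Dataproc API.
    """

    def _build_cluster_data(self):
        cluster_data = super(DataprocClusterCreateSsdOperator, self)._build_cluster_data()
        # Inject the DiskConfig field that the stock operator does not expose.
        for group in ('masterConfig', 'workerConfig'):
            instance_config = cluster_data.get('config', {}).get(group)
            if instance_config is not None:
                instance_config.setdefault('diskConfig', {})['bootDiskType'] = 'pd-ssd'
        return cluster_data

You would then register this class through a regular Airflow plugin (or simply import it in your DAG file) and use it in place of the stock operator.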

Upvotes: 2

tobi6

Reputation: 8239

There seem to be two fields related to this:

master_machine_type: Compute Engine machine type to use for the master node
worker_machine_type: Compute Engine machine type to use for the worker nodes

I found this by simply looking at the source code here (this is for the latest version, since no version was specified in the question):

https://airflow.readthedocs.io/en/latest/_modules/airflow/contrib/operators/dataproc_operator.html
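
For completeness, this is roughly how those two parameters are passed (note that they control the Compute Engine machine type, not the boot disk type; the project, cluster name and zone below are placeholders):

from airflow.contrib.operators.dataproc_operator import DataprocClusterCreateOperator

create_cluster = DataprocClusterCreateOperator(
    task_id='create_dataproc_cluster',
    project_id='my-gcp-project',          # placeholder
    cluster_name='example-cluster',       # placeholder
    num_workers=2,
    zone='europe-west1-b',                # placeholder
    master_machine_type='n1-standard-4',  # machine type for the master node
    worker_machine_type='n1-standard-4',  # machine type for the worker nodes
    dag=dag,                              # assumes a DAG object named `dag` is in scope
)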

Upvotes: 0
