Reputation: 507
I am trying to create a Dataproc cluster using Cloud Composer (Airflow) operators. Here is what my DAG looks like:
default_dag_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': days_ago(1),
    'email': ['****************'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 3,
    'retry_delay': timedelta(minutes=5),
}
CLUSTER_CONFIG = {
    "master_config": {
        "num_instances": 1,
        "machine_type_uri": "n1-standard-4",
        "disk_config": {"boot_disk_type": "pd-standard", "boot_disk_size_gb": 10},
    },
    "worker_config": {
        "num_instances": 2,
        "machine_type_uri": "n1-standard-4",
        "disk_config": {"boot_disk_type": "pd-standard", "boot_disk_size_gb": 10},
    },
}
with models.DAG(
        'PanelSettings_dag',
        schedule_interval="@daily",
        default_args=default_dag_args) as dag:
    t1 = BashOperator(
        task_id='print_date',
        bash_command='date',
    )
    create_cluster = DataprocCreateClusterOperator(
        task_id="create_cluster",
        gcp_conn_id='google-dataproc',
        project_id=GCP_PROJECT_ID,
        cluster_config=CLUSTER_CONFIG,
        region=REGION,
        cluster_name=CLUSTER_NAME,
    )
I created a Dataproc connection in Airflow and granted the Dataproc Admin and Storage Admin roles to its service account. Without this connection I was getting this error:
Getting connection using `google.auth.default()` since no key file is defined for hook.
Now I am getting this error:
[2021-06-16 21:30:48,109] {taskinstance.py:1152} ERROR - 501 Received http2 header with status: 404
Traceback (most recent call last):
  File "/opt/python3.6/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 73, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "/opt/python3.6/lib/python3.6/site-packages/grpc/_channel.py", line 923, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/opt/python3.6/lib/python3.6/site-packages/grpc/_channel.py", line 826, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNIMPLEMENTED
    details = "Received http2 header with status: 404"
    debug_error_string = "{"created":"@1623879048.108981125","description":"Received http2 :status header with non-200 OK status","file":"src/core/ext/filters/http/client/http_client_filter.cc","file_line":129,"grpc_message":"Received http2 header with status: 404","grpc_status":12,"value":"404"}"
I am new to Airflow. Can someone help me debug this? I can't work out what I am doing wrong.
Upvotes: 2
Views: 896
Reputation: 507
The mistake was entering a zone name in the region field. After I corrected it, the cluster was created. It would have been helpful if the error had said something like "region not found" or "region does not exist".
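For anyone hitting the same 404, here is a rough sketch of a guard that could run before the operator is built. It assumes zone names follow the usual "<region>-<letter>" shape (for example "us-central1-a"); the helper name split_location is made up for illustration and is not part of Airflow or the Dataproc client.

```python
import re
from typing import Optional, Tuple

# Assumption: a zone looks like "<region>-<letter>", e.g. "us-central1-a",
# and a region looks like "us-central1". Anything else is passed through.
_ZONE_PATTERN = re.compile(r"^([a-z]+-[a-z]+[0-9]+)-([a-z])$")


def split_location(location: str) -> Tuple[str, Optional[str]]:
    """Return (region, zone); zone is None when `location` is not zone-shaped.

    Hypothetical helper for illustration only.
    """
    match = _ZONE_PATTERN.match(location)
    if match:
        # The value was a zone: strip the trailing letter to get the region.
        return match.group(1), location
    return location, None


region, zone = split_location("us-central1-a")
print(region, zone)  # prints: us-central1 us-central1-a
```

The region part is what goes into region= on DataprocCreateClusterOperator. If a specific zone is needed, my understanding is that it belongs in the cluster config under gce_cluster_config as zone_uri rather than in the region argument.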
Upvotes: 2