Reputation: 1
I am using Dataflow flex templates and am trying to launch a job with a GPU. I am following the docs here to build my template from a base NVIDIA image: https://cloud.google.com/dataflow/docs/guides/using-gpus
I want to run the template with a GPU attached. This requires two experiments:
"worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver"
"use_runner_v2"
I believe these need to be specified separately rather than as a list; at least, I haven't found a way to pass them as a list, and the docs show two separate --experiment arguments. From the docs, I also believe I need to specify the experiments as part of the --parameters argument, as there is no --experiments argument for running flex templates.
I have tried the following:
On the gcloud command line, I specified the experiments argument twice under --parameters. In this case only the second specified experiment is applied.
gcloud dataflow flex-template run "test-flex" --template-file-gcs-location=<LOCATION> --parameters=source="bigquery",experiments="worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver",experiments="use_runner_v2" --max-workers=1 --region=us-east4 --worker-zone=us-east4-b
I then assigned experiments in --parameters to specify the GPU and set --additional-experiments to use_runner_v2.
gcloud dataflow flex-template run "test-flex" --template-file-gcs-location=<LOCATION> --parameters=source="bigquery",experiments="worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver" --max-workers=1 --region=us-east4 --worker-zone=us-east4-b --additional-experiments='use_runner_v2'
This caused an error:
ERROR: (gcloud.dataflow.flex-template.run) INVALID_ARGUMENT: The template parameters are invalid. Details: experiments: Runtime parameter experiments should not be specified in both parameters field and environment field.
I can get each experiment to work separately but cannot get both to work together. Is there a simple fix for this? I haven't been able to find anything in the documentation or figure it out myself.
Please let me know any additional information you need me to provide.
Upvotes: 0
Views: 1922
Reputation: 1383
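You can pass both experiments in --additional-experiments and leave experiments out of --parameters. Either a single comma-separated value:
--additional-experiments="worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver,use_runner_v2"
or the flag repeated once per experiment:
--additional-experiments="worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver" \
--additional-experiments=use_runner_v2
should work. The quotes keep the shell from treating the semicolons as command separators.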
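For reference, here is a rough sketch of how the comma-separated form could be assembled into the full command from the question. The join_experiments helper is hypothetical (it is not part of gcloud), <LOCATION> is the same placeholder used in the question, and the script only prints the command so it can be reviewed before running it.

```shell
# Hypothetical helper: join its arguments with commas into one value
# suitable for a single --additional-experiments flag.
join_experiments() {
  out=""
  for exp in "$@"; do
    if [ -z "$out" ]; then
      out="$exp"
    else
      out="$out,$exp"
    fi
  done
  printf '%s\n' "$out"
}

# Each experiment is quoted so the semicolons are not read as command separators.
EXPERIMENTS="$(join_experiments \
  "worker_accelerator=type:nvidia-tesla-t4;count:1;install-nvidia-driver" \
  "use_runner_v2")"

# Flags taken from the question; <LOCATION> is still a placeholder.
BASE="gcloud dataflow flex-template run \"test-flex\" --template-file-gcs-location=<LOCATION>"
OPTS="--parameters=source=\"bigquery\" --max-workers=1 --region=us-east4 --worker-zone=us-east4-b"

# Print the command for review instead of executing it.
printf '%s\n' "$BASE $OPTS --additional-experiments=\"$EXPERIMENTS\""
```

Note that experiments no longer appears under --parameters here, which avoids the "should not be specified in both parameters field and environment field" error.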
Upvotes: 1