KVK

Reputation: 139

Google Dataflow Templates - passing NumWorkers, MaxNumWorkers, WorkerMachineType, AutoscalingAlgorithm

I am using Google Cloud Dataflow Java SDK 2.1.0.

I can create a template using the --templateLocation pipeline option, along with some other ValueProvider pipeline options that I would like to pass at run time. I invoke the template through the Google Dataflow API from within a Cloud Function.

The template worked with the parameters stagingLocation, tempLocation, and my custom parameters. But when I pass numWorkers, maxNumWorkers, workerMachineType, and autoscalingAlgorithm in the API call, I get a "Found unexpected parameters" error for them. After reading another Stack Overflow post, I created a second template in which these DataflowPipelineOptions are referenced in log statements like the one below:

LOG.info("Number of workers => " + String.valueOf(options.getNumWorkers()));

The new template did accept the default pipeline options mentioned above. So it looks like templates do not carry pipeline options that are not referenced in the pipeline code. That is fine for custom options, but I would expect all the default pipeline options to be available in any template.

Can anyone confirm whether this is expected behavior, or tell me if I am doing something wrong?

Upvotes: 0

Views: 1902

Answers (1)

gvk

Reputation: 66

Environment settings cannot be passed as template parameters the way you tried.

'maxWorkers', 'zone', 'tempLocation', 'machineType', 'network', and 'subnetwork' can be specified as part of the environment in the request, separately from the template parameters. See Example 1 in the Cloud Dataflow documentation for how to specify them.
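
To illustrate the split between parameters and environment, here is a minimal sketch in plain Java that assembles a launch request body. Only the environment field names ('maxWorkers', 'zone', 'machineType', and so on) come from the list above; the class name, the job name, the 'inputFile' parameter, and the bucket path are hypothetical placeholders. The exact body shape your API version expects should be checked against the documentation example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: builds the JSON body for a template launch request.
final class LaunchRequestSketch {

    // Wraps a string in double quotes, escaping backslashes and quotes.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Numbers are emitted bare; everything else is emitted as a JSON string.
    static String render(Object value) {
        return (value instanceof Number) ? value.toString() : quote(String.valueOf(value));
    }

    // Renders a flat map as a compact JSON object, preserving insertion order.
    static String toJson(Map<String, ?> fields) {
        return fields.entrySet().stream()
            .map(e -> quote(e.getKey()) + ":" + render(e.getValue()))
            .collect(Collectors.joining(",", "{", "}"));
    }

    // Template parameters and environment settings go into separate objects.
    static String buildLaunchBody(String jobName,
                                  Map<String, String> parameters,
                                  Map<String, Object> environment) {
        return "{" + quote("jobName") + ":" + quote(jobName)
            + "," + quote("parameters") + ":" + toJson(parameters)
            + "," + quote("environment") + ":" + toJson(environment) + "}";
    }

    public static void main(String[] args) {
        // Custom ValueProvider options defined by the pipeline (placeholder name).
        Map<String, String> parameters = new LinkedHashMap<>();
        parameters.put("inputFile", "gs://my-bucket/input.txt");

        // Worker settings belong here, not in "parameters".
        Map<String, Object> environment = new LinkedHashMap<>();
        environment.put("maxWorkers", 5);
        environment.put("machineType", "n1-standard-1");
        environment.put("zone", "us-central1-f");

        System.out.println(buildLaunchBody("my-job", parameters, environment));
    }
}
```

Putting 'maxWorkers' or 'machineType' under "parameters" instead is what produces the "Found unexpected parameters" error described in the question.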

Upvotes: 4
