Baiqing

Reputation: 1343

Specify specific worker configuration in GCP Dataflow

Is it possible to specify the configuration I want for a single Dataflow worker? The default seems to be 4 cores and 15GB of memory, which is more than I need. How can I downsize it, or is this the smallest worker unit offered?

Upvotes: 0

Views: 819

Answers (1)

hwu76

Reputation: 150

Per the Workers section of the Dataflow > How-to Guides > Deploying a Pipeline page, you can specify a custom machine type (with different cores or memory) with the --worker_machine_type option.

You can also find the other worker-related Dataflow options in the docs/source code for the WorkerOptions class, which parses the worker-related command-line options. Other options it defines include disk_size_gb and worker_disk_type.
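As a rough sketch of how these options fit together, a pipeline launch command might look like the following (the script name, project, and region are placeholders; n1-standard-1 is one example of a smaller machine type with 1 vCPU and 3.75GB of memory):

```shell
# Hypothetical launch of a Beam Python pipeline on Dataflow with a
# smaller worker machine type and a reduced boot disk.
python my_pipeline.py \
  --runner=DataflowRunner \
  --project=my-project \
  --region=us-central1 \
  --worker_machine_type=n1-standard-1 \
  --disk_size_gb=30
```

The same flags can also be passed programmatically when constructing the pipeline's PipelineOptions.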

Tangentially related: Additional GCP-related Dataflow options are handled by the GoogleCloudOptions class.

Upvotes: 1
