Reputation: 1544
There is a requirement to change the network of more than 1000 Dataflow jobs. They are currently running in the default network, and we need to move them to a custom/shared VPC. I thought of using the gcloud dataflow command below, which supports a --network parameter for changing the network, but it may not work for all the jobs.
gcloud dataflow jobs run wc --gcs-location gs://dataflow-templates-us-central1/latest/Word_Count --region us-central1 --subnetwork regions/us-east1/subnetworks/newkube --disable-public-ips
My main concern is that if I change the network using the above command, it will also invoke the Dataflow job, which means the resources used by the job will be launched again. That inflates the cost just to change the network.
Any suggestions on how to change the network for existing jobs without running them, so that on the next run they use the new network?
Upvotes: 0
Views: 643
Reputation: 2825
You don't need to run the job just to change the network. In whatever orchestration tool you currently use to trigger these Dataflow jobs, update the job configuration to add the network (and, if required, subnetwork) parameter.
When the orchestration tool next executes a Dataflow job, it will use the network parameter to spin up the workers inside that VPC. Each time the job runs it reads these configurations and spins up the worker machines accordingly.
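As a sketch of what that launch configuration might look like if the jobs are triggered from a plain shell wrapper rather than a full orchestrator: the region, subnetwork path, job name, and template path below are placeholders, not values taken from your environment. The `echo` keeps the script from actually launching anything; it only prints the command each job would be started with, so the VPC flags are defined once and picked up on every subsequent run.

```shell
#!/bin/sh
# Sketch: define the VPC settings once and reuse them for every job launch.
# REGION and SUBNETWORK are assumed values; adjust for your shared VPC.
REGION="us-central1"
SUBNETWORK="regions/us-east1/subnetworks/newkube"

launch_job() {
    # $1 = job name, $2 = template GCS path.
    # Drop the leading `echo` to actually submit the job; with it, the
    # script only prints the command for review.
    echo gcloud dataflow jobs run "$1" \
        --gcs-location "$2" \
        --region "$REGION" \
        --subnetwork "$SUBNETWORK" \
        --disable-public-ips
}

launch_job wc gs://dataflow-templates-us-central1/latest/Word_Count
```

Because the flags live in one place, pointing all jobs at a new subnetwork is a one-line change that takes effect on the next scheduled run, with no extra job invocation just to switch networks.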
Upvotes: 1