DarioB

Reputation: 1609

How to run Dataflow from python Google API Client Libraries on private subnetwork

I am trying to launch a Dataflow job using the Python Google API client libraries. Everything worked fine until we had to migrate from the default subnetwork to another, private subnetwork. Previously I was launching a Dataflow job with the following code:

    request = dataflow.projects().locations().templates().launch(
        projectId = PROJECT_ID,
        location  = REGION,
        gcsPath   = TEMPLATE_LOCATION,
        body      = {
            'jobName':    job_name,
            'parameters': job_parameters,
        }
    )
    response = request.execute()

However, the job now fails because the default subnetwork no longer exists, and I need to specify the data-subnet subnetwork instead. From this documentation and also this other question, the solution would be trivial if I were launching from the command line: just add the flag --subnetwork regions/$REGION/subnetworks/$PRIVATESUBNET. However, my case is different because I am doing it from code, and in the documentation I can't find any subnetwork parameter option.

Upvotes: 0

Views: 363

Answers (1)

theterminalguy

Reputation: 1941

You can specify a custom subnetwork for your pipeline by adding an `environment` block to the request body, like so:

    request = dataflow.projects().locations().templates().launch(
        projectId = PROJECT_ID,
        location  = REGION,
        gcsPath   = TEMPLATE_LOCATION,
        body      = {
            'jobName':    job_name,
            'parameters': job_parameters,
            'environment': {
                'subnetwork': SUBNETWORK,
            }
        }
    )
    response = request.execute()

Make sure SUBNETWORK is in the form "https://www.googleapis.com/compute/v1/projects/<project-id>/regions/<region>/subnetworks/<subnetwork-name>"
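For example, you can assemble that URL from your project, region, and subnetwork names before building the request. This is a minimal sketch: the subnetwork name `data-subnet` comes from the question, while the project ID and region are placeholders.

```python
# Placeholders - replace with your own values.
PROJECT_ID = "my-project"
REGION = "europe-west1"
SUBNET_NAME = "data-subnet"  # the private subnetwork from the question

# Fully qualified subnetwork URL in the form the Dataflow API expects.
SUBNETWORK = (
    "https://www.googleapis.com/compute/v1/projects/"
    f"{PROJECT_ID}/regions/{REGION}/subnetworks/{SUBNET_NAME}"
)
print(SUBNETWORK)
```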

Upvotes: 1
