Rolintocour

Reputation: 3168

GCP Dataproc: create a cluster with Stackdriver logging enabled

On GCP, I instantiate Dataproc workflow templates to run my processing. I'd like to enable Stackdriver logging to get more metrics (see https://cloud.google.com/dataproc/docs/guides/stackdriver-logging).

According to the documentation, I should set this property:

dataproc:dataproc.logging.stackdriver.job.driver.enable=true

My workflow template looks like:

placement:
  managedCluster:
    clusterName: my-cluster
    config:
      gceClusterConfig:
        zoneUri: europe-west1-d
      masterConfig:
        machineTypeUri: n1-standard-4
      workerConfig:
        machineTypeUri: n1-standard-4
        numInstances: 10

Where should I set this property?

Thanks.

Upvotes: 3

Views: 321

Answers (1)

tix

Reputation: 2158

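The template below should work: cluster properties go under softwareConfig.properties.

Since the API hierarchy is deeply nested, you can build the initial template with the gcloud dataproc workflow-templates commands; the describe command then gives you the correct YAML or JSON. After that, you can iterate quickly by instantiating the template straight from a local file (instantiate-from-file in gcloud, instantiateInline in the API).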

placement:
  managedCluster:
    clusterName: my-cluster
    config:
      gceClusterConfig:
        zoneUri: europe-west1-d
      masterConfig:
        machineTypeUri: n1-standard-4
      workerConfig:
        machineTypeUri: n1-standard-4
        numInstances: 10
      softwareConfig:
        properties:
          dataproc:dataproc.logging.stackdriver.job.driver.enable: 'true'
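
The iteration loop described above might look roughly like the following sketch. The function names are hypothetical wrappers, and the template ID, region, and file name in the usage note are placeholders rather than values from the question. The template is assumed to already exist and to get its jobs added separately.

```shell
# Sketch only: the wrapper names are illustrative; the gcloud subcommands are real.

# Property from the question, in the PREFIX:KEY=VALUE form used by --properties.
PROP='dataproc:dataproc.logging.stackdriver.job.driver.enable=true'

# Attach a managed cluster carrying the logging property to an existing template.
# $1 = template ID, $2 = region
set_cluster_with_logging() {
  gcloud dataproc workflow-templates set-managed-cluster "$1" \
    --region="$2" \
    --cluster-name=my-cluster \
    --properties="$PROP"
}

# Print the template so the generated structure can be saved and edited locally.
# $1 = template ID, $2 = region
describe_template() {
  gcloud dataproc workflow-templates describe "$1" --region="$2"
}

# Run a locally edited template file without registering it first.
# $1 = path to the YAML file, $2 = region
instantiate_from_file() {
  gcloud dataproc workflow-templates instantiate-from-file \
    --file="$1" --region="$2"
}
```

For example, `describe_template my-template europe-west1 > template.yaml` followed by `instantiate_from_file template.yaml europe-west1`. The described output may contain server-generated fields that have to be trimmed before the file can be reused; `gcloud dataproc workflow-templates export` is an alternative worth checking for that step.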

Upvotes: 4
