Pierre-Alexandre
Pierre-Alexandre

Reputation: 775

GCP workflow: load external sql file?

I am planning to have a Cloud Scheduler that calls a GCP Workflows every day at 8 a.m. My GCP Workflows will have around 15 different steps and will be only transformations (update, delete, add) on BigQuery. Some queries will be quite long and I am wondering if there is a way to load a .sql file into a GCP Workflows task1.yaml?

#workflow entrypoint
ProcessItem:
  params: [project, gcsPath]
  steps:
    - initialize:
        assign:
          - dataset: wf_samples
          - input: ${gcsPath}
          - sqlQuery: QUERY HERE
   ...

Upvotes: 1

Views: 1596

Answers (2)

Brandon Reid
Brandon Reid

Reputation: 1

For a more updated method based on @Pentium10's answer with storing the SQL in GCS. You can simply use the Storage Workflow API available for Workflows.

    - download_sql:
        call: googleapis.storage.v1.objects.get
        args:
          bucket: {your_bucket_name}
          object: {your_file_name}.sql
          alt: "media" # needed to retrieve the content of the file
        result: object_data
    - run_sql:
        call: googleapis.bigquery.v2.jobs.query
        args:
          projectId: ${project_id}
          body:
            query: '${text.decode(object_data)}'
        result: query_response

In this case you will need to use text.decode() to convert the file content from bytes to a string.

Upvotes: 0

Pentium10
Pentium10

Reputation: 208012

You need to do something similar: (of course you can assign this to a variable like input)

#workflow entrypoint
main:
  steps:
    - getSqlfile:
        call: http.get
        args:
          url: https://raw.githubusercontent.com/jisaw/sqlzoo-solutions/master/select-in-select.sql
          headers:
            Content-Type: "text/plain"
        result: queryFromFile
    - final:
        return: ${queryFromFile.body}

For Cloud Storage that may look like:

 call: http.get
    args:
      url: https://storage.cloud.google.com/................./q1.sql
      headers:
        Content-Type: "text/plain"
      auth:
        type: OIDC
    result: queryFromFile

Or event with this format (different URL syntax + OAuth2)

call: http.get
    args:
      url: https://storage.googleapis.com/................./q1.sql
      headers:
        Content-Type: "text/plain"
      auth:
        type: OAuth2
    result: queryFromFile

Make sure that invoker has the right permission to access the Cloud Storage file.

Note: On further testing, this to work correctly the text/plain mime-type must be set on the GCS file.

Upvotes: 5

Related Questions