Reputation: 2617
I am working on an ELT pipeline using Workflows. So far, so good. However, one of my tables is based on a Google Sheet, and that job fails with "Access Denied: BigQuery BigQuery: Permission denied while getting Drive credentials."
I know I need to add the https://www.googleapis.com/auth/drive scope to the request, and the service account used by the workflow needs access to the sheet. The access is correct, and if I do an authenticated insert using curl, it works fine.
So my conclusion is that I need to add the Drive scope; however, I do not know where or how to add it. Am I missing something?
The step in the workflow:

```yaml
call: googleapis.bigquery.v2.jobs.insert
args:
    projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
    body:
        configuration:
            query:
                query: select * from `*****.domains_sheet_view`
                destinationTable:
                    projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                    datasetId: ***
                    tableId: domains
                create_disposition: CREATE_IF_NEEDED
                write_disposition: WRITE_TRUNCATE
                allowLargeResults: true
                useLegacySql: false
```
Upvotes: 0
Views: 1660
Reputation: 166
Option 1: `http.post` + `googleapis.bigquery.v2.jobs.query`

```yaml
call: http.post
args:
    url: ${"https://bigquery.googleapis.com/bigquery/v2/projects/" + project + "/queries"}
    headers:
        Content-type: "application/json"
    auth:
        type: OAuth2
        scope: ["https://www.googleapis.com/auth/drive", "https://www.googleapis.com/auth/cloud-platform", "https://www.googleapis.com/auth/bigquery"]
    body:
        query: select * from sheets.sheets_data
        timeoutMs: 200000
        useLegacySql: false
result: response
```
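The `project` variable referenced in the URL is not defined in this snippet; it can be assigned earlier in the workflow from the environment, for example:

```yaml
- initialize:
    assign:
        # resolve the current project ID from the workflow's environment
        - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
```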
Option 2: Scheduled query + `googleapis.bigquerydatatransfer.v1`

```yaml
call: googleapis.bigquerydatatransfer.v1.projects.locations.transferConfigs.startManualRuns
args:
    parent: ${scheduled_query_name}
    body:
        requestedRunTime: ${time.format(sys.now())}
result: response
```
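Here `${scheduled_query_name}` is the full resource name of the scheduled query's transfer config, in the form `projects/PROJECT/locations/LOCATION/transferConfigs/CONFIG_ID`. A minimal sketch of how it might be assigned (the IDs below are hypothetical):

```yaml
- initialize:
    assign:
        # hypothetical transfer config resource name; substitute your own
        - scheduled_query_name: "projects/my-project/locations/us/transferConfigs/12345678-0000-0000-0000-000000000000"
```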
Upvotes: 1
Reputation: 208012
AFAIK, for connectors you cannot customize the `scope` parameter, but you can set it yourself if you put together the HTTP call manually.

Here is my workflow:
```yaml
# workflow entrypoint
main:
    steps:
        - initialize:
            assign:
                - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
        - makeBQJob:
            call: BQJobsInsertJobWithSheets
            args:
                project: ${project}
                configuration:
                    query:
                        query: SELECT * FROM `ndc.autoritati_publice` LIMIT 10
                        destinationTable:
                            projectId: ${project}
                            datasetId: ndc
                            tableId: autoritati_destination
                        create_disposition: CREATE_IF_NEEDED
                        write_disposition: WRITE_TRUNCATE
                        allowLargeResults: true
                        useLegacySql: false
            result: res
        - final:
            return: ${res}

# subworkflow definitions
BQJobsInsertJobWithSheets:
    params: [project, configuration]
    steps:
        - runJob:
            try:
                call: http.post
                args:
                    url: ${"https://bigquery.googleapis.com/bigquery/v2/projects/" + project + "/jobs"}
                    headers:
                        Content-type: "application/json"
                    auth:
                        type: OAuth2
                        # custom scopes, including Drive, so BigQuery can read the Sheets-backed table
                        scope: ["https://www.googleapis.com/auth/drive", "https://www.googleapis.com/auth/cloud-platform", "https://www.googleapis.com/auth/bigquery"]
                    body:
                        configuration: ${configuration}
                result: queryResult
            except:
                as: e
                steps:
                    - UnhandledException:
                        raise: ${e}
            # on success, jump straight to queryCompleted
            next: queryCompleted
        # fallback returns, skipped by the `next` jump above
        - pageNotFound:
            return: "Page not found."
        - authError:
            return: "Authentication error."
        - queryCompleted:
            return: ${queryResult.body}
```
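Note that `jobs.insert` is asynchronous: `queryResult.body` is a Job resource whose `status.state` may still be `PENDING` or `RUNNING`. If the workflow needs to wait for completion, a follow-up call to `jobs.get` can check the state; a minimal sketch, assuming the same OAuth2-authenticated HTTP approach (the subworkflow name is my own):

```yaml
BQJobsGetStatus:
    params: [project, jobId]
    steps:
        - getJob:
            # jobs.get returns the Job resource, including status.state
            call: http.get
            args:
                url: ${"https://bigquery.googleapis.com/bigquery/v2/projects/" + project + "/jobs/" + jobId}
                auth:
                    type: OAuth2
                    scope: ["https://www.googleapis.com/auth/bigquery"]
            result: jobResult
        - returnState:
            return: ${jobResult.body.status.state}
```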
Upvotes: 3