Reputation: 1
I tried to read BQ via scio with:
val tbleRows = sc.withName("Query BQ Table").bigQuerySelect(query)
or
val tbleRows = sc.withName("Query BQ Table").bigQueryStorage(query)
In both cases Scio tries to create a staging dataset in BQ, but my service account doesn't have permission for that. I tried to point it at an existing BQ dataset via all of these options:
sc.optionsAs[DataflowPipelineOptions].setStagingLocation("gs://[project-id]-temp/scio")
sc.optionsAs[BigQueryOptions].setTempDatasetId("[project-id].temp/scio")
sc.optionsAs[BigQueryOptions].setGcpTempLocation("gs://[project-id]-temp/scio")
sc.optionsAs[BigQueryOptions].setTempLocation("gs://[project-id]-temp/scio")
sc.options.setTempLocation("gs://[project-id]-temp/scio")
But none of these options has any effect.
I want to either use an existing dataset or avoid the staging dataset altogether.
Upvotes: 0
Views: 36
Reputation: 1455
In my experience, the temp dataset is always left at its default value (cloud_dataflow). Instead, the service account is granted the permissions it needs for the job to read data from BigQuery, e.g. the roles/bigquery.dataEditor role.
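For reference, that role can be granted with the gcloud CLI. A minimal sketch, where PROJECT_ID and SA_EMAIL are placeholders for your own project ID and service-account email:

```shell
# Grant the job's service account the BigQuery Data Editor role,
# which lets it create the staging/temp dataset Scio needs.
# PROJECT_ID and SA_EMAIL are placeholders; substitute your own values.
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:SA_EMAIL" \
  --role="roles/bigquery.dataEditor"
```

This grants the role project-wide; if that is too broad, the same role can instead be granted on a single dataset from the BigQuery console.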
Upvotes: 0