Dirk Gasser

Reputation: 1

Creating staging dataset [project-id]:scio_bigquery_staging_europe_west3

I tried to read a BQ table via Scio with:

val tbleRows = sc.withName("Query BQ Table").bigQuerySelect(query)

or

val tbleRows = sc.withName("Query BQ Table").bigQueryStorage(query)

In both cases Scio tries to create a staging dataset in BQ, but my service account doesn't have permission for that. I tried to point it at an existing BQ dataset via all of these options:

sc.optionsAs[DataflowPipelineOptions].setStagingLocation("gs://[project-id]-temp/scio")
sc.optionsAs[BigQueryOptions].setTempDatasetId("[project-id].temp/scio")
sc.optionsAs[BigQueryOptions].setGcpTempLocation("gs://[project-id]-temp/scio")
sc.optionsAs[BigQueryOptions].setTempLocation("gs://[project-id]-temp/scio")
sc.options.setTempLocation("gs://[project-id]-temp/scio")

But none of these options has any effect.

I want to either use an existing dataset or avoid using a staging dataset altogether.

Upvotes: 0

Views: 36

Answers (1)

Chris

Reputation: 1455

In my experience, the temp dataset is usually left at its default value (cloud_dataflow). Instead of overriding it, the necessary permissions are granted to the service account so the job can read from BigQuery, e.g. the roles/bigquery.dataEditor role.
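As a sketch, granting that role to the Dataflow service account could look like the following (the project ID and service-account email are placeholders you'd replace with your own values):

```shell
# Grant BigQuery data-editor permissions to the job's service account,
# so the pipeline can create/use the staging dataset it needs.
gcloud projects add-iam-policy-binding my-project-id \
  --member="serviceAccount:my-dataflow-sa@my-project-id.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor"
```

If you want to scope the grant more narrowly, the role can also be bound on a single dataset instead of the whole project.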

Upvotes: 0
