Reputation: 582
I'm trying to connect a Spark job running in a private datacenter to BigQuery. I created a service account, downloaded its private JSON key, and granted it read access to the dataset I want to query. But when I integrate it with Spark, I receive: User does not have bigquery.tables.create permission for dataset xxx:yyy. Do we need table-create permission just to read data from a table using BigQuery?
Below is the response printed on the console:
{
"code" : 403,
"errors" : [ {
"domain" : "global",
"message" : "Access Denied: Dataset xxx:yyy: User does not have bigquery.tables.create permission for dataset xxx:yyy.",
"reason" : "accessDenied"
} ],
"message" : "Access Denied: Dataset xxx:yyy: User does not have bigquery.tables.create permission for dataset xxx:yyy.",
"status" : "PERMISSION_DENIED"
}
Below is the Spark code I'm using to access BigQuery:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object ConnectionTester extends App {
  val session = SparkSession.builder()
    .appName("big-query-connector")
    .config(getConf)
    .getOrCreate()

  session.read
    .format("bigquery")
    .option("viewsEnabled", true)
    .load("xxx.yyy.table1")
    .select("col1")
    .show(2)

  private def getConf: SparkConf = {
    val sparkConf = new SparkConf
    sparkConf.setAppName("big-query-connector")
    sparkConf.setMaster("local[*]")
    sparkConf.set("parentProject", "my-gcp-project")
    sparkConf.set("credentialsFile", "<path to my credentialsFile>")
    sparkConf
  }
}
Upvotes: 1
Views: 1643
Reputation: 30448
For reading regular tables there's no need for the bigquery.tables.create permission. However, the code sample you've provided hints that the table is actually a BigQuery view. BigQuery views are logical references: they are not materialized on the server side, so in order for Spark to read them they first need to be materialized into a temporary table. Creating this temporary table is what requires the bigquery.tables.create permission.
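One way around this is to point the connector at a dataset where the service account *does* have create permission, using the connector's materializationProject/materializationDataset options. A sketch (materialization_ds is a hypothetical dataset name; swap in one your service account can write to):

```scala
import org.apache.spark.sql.SparkSession

object ViewReader extends App {
  val spark = SparkSession.builder()
    .appName("big-query-view-reader")
    .master("local[*]")
    .getOrCreate()

  spark.read
    .format("bigquery")
    .option("viewsEnabled", "true")
    // Temp tables for view materialization are created here, so the
    // service account only needs bigquery.tables.create on this dataset.
    .option("materializationProject", "my-gcp-project")     // placeholder
    .option("materializationDataset", "materialization_ds") // placeholder
    .load("xxx.yyy.table1")
    .select("col1")
    .show(2)
}
```

Alternatively, grant the service account bigquery.tables.create on the view's own dataset.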
Upvotes: 2
Reputation: 10382
Check the code below.
Credential
val credentials =
  """
    |{
    |  "type": "service_account",
    |  "project_id": "your project id",
    |  "private_key_id": "your private_key_id",
    |  "private_key": "-----BEGIN PRIVATE KEY-----\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\n-----END PRIVATE KEY-----\n",
    |  "client_email": "[email protected]",
    |  "client_id": "111111111111111111111111111",
    |  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    |  "token_uri": "https://oauth2.googleapis.com/token",
    |  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    |  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/xxxxx40vvvvvv.iam.gserviceaccount.com"
    |}
    |""".stripMargin
Encode it as base64 and pass it to the Spark conf.
def base64(data: String): String = {
  import java.nio.charset.StandardCharsets
  import java.util.Base64
  Base64.getEncoder.encodeToString(data.getBytes(StandardCharsets.UTF_8))
}

spark.conf.set("credentials", base64(credentials))

spark.read
  .format("bigquery")
  .option("parentProject", "parentProject")
  .option("table", "dataset.table")
  .load()
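As a standalone sanity check (plain JVM, no Spark or BigQuery needed), the base64 helper can be exercised like this; the sample JSON is just an illustrative stand-in for a real key file:

```scala
import java.nio.charset.StandardCharsets
import java.util.Base64

object Base64Check {
  // Same encoding helper as above.
  def base64(data: String): String =
    Base64.getEncoder.encodeToString(data.getBytes(StandardCharsets.UTF_8))

  // Decode back, to confirm the round trip is lossless.
  def fromBase64(encoded: String): String =
    new String(Base64.getDecoder.decode(encoded), StandardCharsets.UTF_8)

  def main(args: Array[String]): Unit = {
    val sample  = """{"type": "service_account"}"""
    val encoded = base64(sample)
    println(encoded) // prints eyJ0eXBlIjogInNlcnZpY2VfYWNjb3VudCJ9
    assert(fromBase64(encoded) == sample)
  }
}
```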
Upvotes: 0