Reputation: 13500
Good day,
I am new to Google Cloud Storage (GCS) and have recently been assigned a task to write data to a GCS bucket. I've done this before for S3, but I'm not sure how to do it with GCS. I have found some sample code here and there (like the one in this link or this one), but none of it is what I need. This is what has been provided to me:
bucket_name = {
  google_storage_hmac_access_id = "SOMEKEY"
  google_storage_hmac_secret = "SOMEKEY"
}
The approach in the first link requires a JSON credentials file, which is not what I have in hand. So I used the approach in the second link and added the following to my code:
spark_context._jsc.hadoopConfiguration().set(
    'fs.gs.impl', 'com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem'
)
# Set this to 'true' only if you are using a service account
spark_context._jsc.hadoopConfiguration().set(
    'fs.gs.auth.service.account.enable', 'false'
)
# The following are required if you are using OAuth
spark_context._jsc.hadoopConfiguration().set(
    'fs.gs.auth.client.id', gcs_key
)
spark_context._jsc.hadoopConfiguration().set(
    'fs.gs.auth.client.secret', gcs_secret
)
where gcs_key and gcs_secret are the credentials provided to me to connect to that bucket. This is the path I set:
gs://bucket_name
When I try this, it opens a login page asking me to grant access to GCS with an email address, which is clearly not what I want. I am looking for a working example of how to read and write data in a GCS bucket using those credentials.
Note 1: I have used the same access_id and secret to set up gsutil, and everything seems to be working fine.
Note 2: I have included the required jar file in the Spark jars directory (gcs-connector-hadoop3-latest.jar).
Upvotes: 0
Views: 1770
Reputation: 1253
As you can see here, most operations you perform in Cloud Storage (such as reading or writing an object) must be authenticated. Unless your objects are public, you must authenticate before performing an operation on an object or bucket. You can choose between gsutil authentication, API authentication, client library authentication, or user account credentials.
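For the HMAC access ID and secret in the question, one option that may work is to skip the gs:// connector and go through Cloud Storage's S3-compatible XML API using Hadoop's S3A connector. The fs.gs.auth.client.id and fs.gs.auth.client.secret settings expect an OAuth client, not an HMAC key, which would explain the login page. The following is only a sketch under assumptions, not something verified against this bucket: it assumes the hadoop-aws jar and its matching AWS SDK bundle are on the Spark classpath, that paths are written as s3a://bucket_name instead of gs://bucket_name, and the helper names s3a_settings_for_gcs_hmac and apply_hadoop_settings are hypothetical.

```python
def s3a_settings_for_gcs_hmac(access_id, secret):
    """Build Hadoop S3A settings that point at the GCS XML API endpoint.

    Assumption: the bucket is reachable through Cloud Storage's
    S3-interoperable endpoint using an HMAC access ID and secret.
    """
    return {
        "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
        "fs.s3a.endpoint": "https://storage.googleapis.com",
        "fs.s3a.access.key": access_id,
        "fs.s3a.secret.key": secret,
        # Commonly set when S3A talks to a non-AWS endpoint.
        "fs.s3a.path.style.access": "true",
    }


def apply_hadoop_settings(spark_context, settings):
    """Copy each key/value pair into the Spark context's Hadoop configuration."""
    hadoop_conf = spark_context._jsc.hadoopConfiguration()
    for key, value in settings.items():
        hadoop_conf.set(key, value)
```

With these in place, something like apply_hadoop_settings(spark_context, s3a_settings_for_gcs_hmac(gcs_key, gcs_secret)) followed by a read or write against an s3a://bucket_name/... path would be the thing to try. If a service account JSON key can be obtained instead, the gs:// connector with service account authentication is the more conventional route.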
Upvotes: 0