niCk cAMel

Reputation: 909

How to access gcloud bucket storage from script running in a Cloud Run job

I'm able to create an image from source and create and run a Cloud Run job through the gcloud CLI.

gcloud builds submit --pack image=gcr.io/<my-project-id>/logger-job
gcloud beta run jobs create helloworld --image gcr.io/<my-project-id>/logger-job --region europe-west9
gcloud beta run jobs execute helloworld --region=europe-west9

From the CLI, I can also upload files to a bucket by running this Python script:

import sys
from google.cloud import storage

def get_bucket(bucket_name):
    storage_client = storage.Client.from_service_account_json(
        'my_credentials_file.json')

    # Make an authenticated API request
    buckets = list(storage_client.list_buckets())
    
    found = False
    for bucket in buckets:
        if bucket_name in bucket.name:
            found = True
            break

    if not found:
        print("Error: bucket not found")
        sys.exit(1)

    print(bucket)

    return bucket

def upload_file(bucket, fname):
    destination_blob_name = fname
    source_file_name = fname

    blob = bucket.blob(destination_blob_name)

    blob.upload_from_filename(source_file_name)

    print(
        f"File {source_file_name} uploaded to {destination_blob_name}."
    )

if __name__ == "__main__":
    bucket = get_bucket('MyBucket')
    upload_file(bucket, 'testfile')

Before I can even figure out how the authentication towards the bucket works when running this script through a job, I'm getting errors on the target:

ModuleNotFoundError: No module named 'google'

Which kind of makes sense, but how would I include the google.cloud module when running the script in the job? Or should I access the bucket in another way?

Upvotes: 0

Views: 1221

Answers (2)

niCk cAMel

Reputation: 909

Place a requirements.txt file, containing the third-party modules that the code needs, in the folder that is built when executing

gcloud builds submit --pack image=gcr.io/<my-project-id>/logger-job

The buildpack will then pip install those modules into the image at build time, so they are available when the job executes.
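
For example, a minimal requirements.txt for the script above would contain just the Cloud Storage client library (the version pin here is only an illustration):

google-cloud-storage>=2.0.0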

Upvotes: 0

DazWilkin

Reputation: 40136

There are multiple questions and you would benefit from reading Google's documentation.

When (!) the Python code that interacts with Cloud Storage is combined into the Cloud Run deployment, it will run using the identity of its Cloud Run service. Unless otherwise specified, this will be the default Cloud Run Service Account. You should create and reference a Service Account specific to this Cloud Run service.
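
As a sketch, with logger-job-sa as a placeholder name (check gcloud beta run jobs update --help for the flags available in your gcloud version):

gcloud iam service-accounts create logger-job-sa
gcloud beta run jobs update helloworld \
    --service-account=logger-job-sa@<my-project-id>.iam.gserviceaccount.com \
    --region=europe-west9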

You will need to determine which IAM permissions you wish to confer on the Service Account. Generally you'll reference a set of permissions using a Cloud Storage IAM role.
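
For uploads like those in the question, roles/storage.objectAdmin on the bucket is a common choice. One way to grant it, using placeholder bucket and account names:

gsutil iam ch \
    serviceAccount:logger-job-sa@<my-project-id>.iam.gserviceaccount.com:roles/storage.objectAdmin \
    gs://<my-bucket>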

You're (correctly) using a Google-provided client library to interact with Cloud Storage, so you will be able to benefit from Application Default Credentials. This makes it easy to run your Python code on e.g. Cloud Run as a Service Account.
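
Concretely, this means the script can drop the key file entirely; constructing the client with no arguments lets the library resolve credentials from the environment:

from google.cloud import storage

# With no arguments, the client uses Application Default Credentials,
# which on Cloud Run resolve to the job's attached Service Account.
storage_client = storage.Client()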

Lastly, you should review Google's Cloud Storage client library (for Python) documentation to see how to use e.g. pip to install the client library.
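
For local development that is (note the package name is google-cloud-storage, not google):

pip install google-cloud-storage

When building with the buildpack, the same package name goes into requirements.txt, as the other answer describes.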

Upvotes: 1
