Reputation: 25
The context is that I need to end the process if a file stored in Google Cloud Platform is empty, but if it isn't empty, follow the normal workflow. I'm doing this with a branch operator in Airflow, but I have to pass it a condition to decide whether the process should end there or continue.
So my question is: how can I get the size of a flat file stored in a bucket in GCP?
Thanks in advance!
Upvotes: 2
Views: 2597
Reputation: 4051
You can use the Blob/Object methods built into the Google Cloud Storage client library for Python.
To check whether a file is in your bucket and its size is greater than zero, I have written the following code:
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('bucket_name')
desired_file = "file_name.csv"

# List every object in the bucket and look for the desired file
for blob in bucket.list_blobs():
    if desired_file == blob.name and blob.size > 0:
        print("Name: " + blob.name + " Size: " + str(blob.size) + " bytes")
        # do something
Above, the list_blobs() method is used to list all the files in the specified bucket. Then blob.name retrieves each file's name and blob.size returns its size in bytes. After this small chunk of code you can continue with your tasks.
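Since the question mentions a branch operator in Airflow, here is a minimal sketch of how this check could drive the branching decision. It assumes Airflow 2's import path for BranchPythonOperator, and the task ids continue_task and end_task are hypothetical placeholders; treat it as an illustration rather than a drop-in DAG.

from airflow.operators.python import BranchPythonOperator
from google.cloud import storage

def choose_branch():
    # Fetch the object's metadata; get_blob() returns None if the object does not exist
    client = storage.Client()
    blob = client.bucket('bucket_name').get_blob('file_name.csv')
    if blob is not None and blob.size > 0:
        return 'continue_task'  # hypothetical id of the normal workflow task
    return 'end_task'           # hypothetical id of the task that ends the process

# To be declared inside your DAG definition
check_file = BranchPythonOperator(
    task_id='check_file_size',
    python_callable=choose_branch,
)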
Additional information: It is also possible to filter the files you list with a prefix, in case there is a huge number of them, for example for blob in client.bucket('bucket_name').list_blobs(prefix='test_'):, as sketched below.
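A short sketch of that variant, combining the prefix filter with the size check (the prefix test_ is just a placeholder):

from google.cloud import storage

client = storage.Client()
bucket = client.bucket('bucket_name')

# Restrict the listing to objects whose names start with 'test_'
# and keep only the ones that are not empty
for blob in bucket.list_blobs(prefix='test_'):
    if blob.size > 0:
        print(blob.name, blob.size, "bytes")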
UPDATE:
In order to grant more fine-grained permissions on specific buckets and objects, you can use Access Control Lists (ACLs). They let you define access to particular buckets and objects according to the desired access level. In the console, go to: Storage > Bucket > click on your file > click on EDIT PERMISSIONS (upper middle of the screen, next to DOWNLOADS) > Add item. Then select the entity you want to add, such as Project, Domain, Group or User, and fill in the name (email, project or service account). See Google's "How to use ACLs" guide for details.
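The same kind of ACL change can also be made with the Python client library; a minimal sketch, assuming the object already exists and using a placeholder email address:

from google.cloud import storage

client = storage.Client()
bucket = client.bucket('bucket_name')
blob = bucket.blob('file_name.csv')

# Grant read access on this single object to a specific user
# (replace the placeholder address with the real account)
blob.acl.user('user@example.com').grant_read()
blob.acl.save()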
Upvotes: 1