How can I use wildcards in my GCP bucket object path?

Question

My main problem is that I want to check whether an object exists in a GCP bucket. This is what I tried:

from google.cloud import storage
client = storage.Client()
path_exists = False
for blob in client.list_blobs('models', prefix='trainedModels/mddeep256_sarim'):
    path_exists = True
    break

This worked fine. The problem now is that I don't know the first part of the model name (mddeep256 here); I only know the trailing part, _sarim.

So, I want to use something like

for blob in client.list_blobs('models', prefix='trainedModels/*_sarim'):

I want to use the * wildcard. How can I do that?

Gaurang Shah · Accepted Answer

list_blobs doesn't support regular expressions or wildcards in prefix; the prefix is matched literally. You need to filter the results yourself, as mentioned by Guilaume.

The following should work.

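import re

from google.cloud import storage


def is_object_exist(bucket_name, object_pattern):
    """Return True if any object name in the bucket matches the regex."""
    client = storage.Client()
    regex = re.compile(object_pattern)
    # re.match anchors at the start of the name, so include the leading
    # path in the pattern, e.g. r'trainedModels/.*_sarim'.
    # any() stops at the first match instead of collecting every blob.
    return any(regex.match(blob.name) for blob in client.list_blobs(bucket_name))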
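
The function above scans the bucket without a prefix, which can be slow on large buckets. As a rough sketch of an alternative, assuming shell-style wildcards are enough, you could pass the literal part of the path as prefix and filter the rest on the client with fnmatch. The names any_name_matches and object_exists are hypothetical helpers, not part of the google-cloud-storage library.

```python
from fnmatch import fnmatchcase


def any_name_matches(names, pattern):
    """Return True if any name matches the shell-style pattern (case-sensitive)."""
    return any(fnmatchcase(name, pattern) for name in names)


def object_exists(bucket_name, pattern, prefix=None):
    """Check a bucket for at least one object whose name matches `pattern`.

    `prefix` is the literal part of the path before the first wildcard;
    passing it lets the server narrow the listing before the client-side
    filter runs.
    """
    # Imported lazily so the pure helper above works without the library.
    from google.cloud import storage

    client = storage.Client()
    blobs = client.list_blobs(bucket_name, prefix=prefix)
    return any_name_matches((blob.name for blob in blobs), pattern)
```

For the case in the question, a call might look like object_exists('models', 'trainedModels/*_sarim*', prefix='trainedModels/'). Note that fnmatch's * also matches '/', and the trailing * keeps the behaviour of the original prefix check, which accepted any object whose name starts with the model path. Recent releases of google-cloud-storage also appear to accept a match_glob argument on list_blobs for server-side glob filtering; check the documentation for your installed version before relying on it.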
