Jean-Baptiste Rudant

Reputation: 1066

Run a Google Cloud Function for each file in a bucket

I have a Google Cloud Function triggered by a Google Cloud Storage object.finalize event. When I deploy a new version of this function, I would like to run it for every existing file in the bucket (all of which have already been processed by the previous version of the function). Processing all the existing files in the bucket is a long-running task, so I don't think a single Google Cloud Function that processes all the files in one run is an option.

The best option I can see for now is to make a Google Cloud Function that I can trigger via HTTP, which will list all the files in the bucket and publish one event per file via Google PubSub, and then process each of these events with a slightly modified version of my initial Google Cloud Function that accepts a PubSub event in place of the object.finalize storage event.

I think it can work but I was wondering if there was an easier way to perform this operation.

Upvotes: 1

Views: 2380

Answers (2)

Brandon Yarbrough

Reputation: 38379

One option might be to write a small program that lists all of the objects in a bucket and, for each object, posts a message to Cloud Pub/Sub that triggers your function in the same way a GCS change would.
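A minimal sketch of such a program in Python, assuming the google-cloud-storage and google-cloud-pubsub client libraries are installed. The project, topic, and bucket names passed to main, the helper names, and the JSON payload shape (just "bucket" and "name") are illustrative assumptions, not something prescribed by GCS or Pub/Sub:

```python
import json


def build_message(bucket_name, object_name):
    """Encode the bucket and object name as a JSON payload (assumed shape)."""
    return json.dumps({"bucket": bucket_name, "name": object_name}).encode("utf-8")


def publish_all(bucket_name, object_names, publish):
    """Call publish(payload_bytes) once per object name; return the count."""
    count = 0
    for name in object_names:
        publish(build_message(bucket_name, name))
        count += 1
    return count


def main(project_id, topic_id, bucket_name):
    # Imported here so the helpers above can be used without the
    # Google Cloud client libraries being installed.
    from google.cloud import pubsub_v1, storage

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)

    # list_blobs pages through the bucket lazily, so large buckets
    # are not loaded into memory all at once.
    names = (blob.name for blob in storage.Client().list_blobs(bucket_name))

    futures = []
    publish_all(
        bucket_name,
        names,
        lambda data: futures.append(publisher.publish(topic_path, data)),
    )

    # Wait until every message has actually been accepted by Pub/Sub.
    for future in futures:
        future.result()
```

Something like main("my-project", "reprocess-files", "my-bucket") (all three names hypothetical) could be run from a laptop or from an HTTP-triggered function. The function subscribed to the topic would then need to decode this payload instead of reading the fields directly from the storage event.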

Upvotes: 1

Frank van Puffelen

Reputation: 598847

If the operation you're trying to perform may take longer than the maximum time that a Cloud Function can run, you will need to split that operation into multiple steps. Your approach of using a PubSub trigger for each individual file sounds like a valid way to do that to me.
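As a rough sketch of the "slightly modified" function, the handler below accepts either event shape, assuming the background-function signature (event, context). It also assumes each Pub/Sub message carries a JSON payload with "bucket" and "name" fields, which is a convention you would define yourself. process_file is a hypothetical placeholder for your existing logic:

```python
import base64
import json


def extract_object(event):
    """Return (bucket, name) from a storage event or a Pub/Sub event."""
    if "bucket" in event and "name" in event:
        # object.finalize event: the object metadata is in the event itself.
        return event["bucket"], event["name"]
    # Pub/Sub event: the payload arrives base64-encoded under "data".
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    return payload["bucket"], payload["name"]


def process_file(bucket, name):
    """Hypothetical placeholder for the real per-file processing."""
    return f"gs://{bucket}/{name}"


def handle_event(event, context=None):
    """Entry point usable for both the storage and the Pub/Sub trigger."""
    bucket, name = extract_object(event)
    return process_file(bucket, name)
```

With this shape the same code can be deployed twice, once per trigger, so each file is handled in its own short invocation and no single run has to cover the whole bucket.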

Upvotes: 2
