Rogier Lommers

Reputation: 2443

GCP cloud functions, how to trigger everything in a bucket?

I'm playing around with Google Cloud Functions. My first conclusion: they are really great! I created a function that is triggered whenever a document stored in a bucket is modified (or a new one is uploaded). This works fine.

But then I started to think: what if I want to run a NEW function against all files already inside the bucket? The previous functions have already been run against all files, so I'd prefer to run only the NEW function against all documents.

How do you guys do this? So basically, my questions are:

- How do you keep track of which functions have already been applied to the files?
- How do you trigger all files to re-apply all functions?
- How do you trigger all files for just ONE (new) function?

Upvotes: 0

Views: 1959

Answers (1)

John Hanley

Reputation: 81346

How do you keep track of which functions have already been applied to the files?

Cloud Functions triggers on events. Once an event fires, a Cloud Function is called (if set up to do so). Nothing within GCP keeps track of this except for Stackdriver logs. Your functions will need to keep track of their own actions, including which object they were triggered for.
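A minimal sketch of that bookkeeping, using an in-memory dict as a stand-in for a durable store such as Cloud Datastore (all function and object names here are hypothetical):

```python
# Sketch: each function records which objects it has already handled.
# An in-memory dict stands in for a durable store such as Cloud Datastore;
# in a real deployment each function would read/write these records itself.

processed = {}  # (object_name, function_name) -> True once handled


def already_processed(object_name, function_name):
    return processed.get((object_name, function_name), False)


def handle_event(object_name, function_name, work):
    """Run `work` on the object only if this function hasn't seen it yet."""
    if already_processed(object_name, function_name):
        return "skipped"
    work(object_name)
    processed[(object_name, function_name)] = True
    return "processed"
```

With this, a re-delivered or replayed event for the same object becomes a cheap no-op instead of duplicate work.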

How do you trigger all files to re-apply all functions?

There is no command or feature to trigger a function for all files. You will need to implement this feature yourself.

How do you trigger all files for just ONE (new) function?

There is no command or feature to trigger all existing files against a single new function either. You will need to implement this feature yourself.

Depending on the architecture that you are trying to implement, most people use a database such as Cloud Datastore to track objects within a bucket, transformations that occur and results.

Using a database will allow you to accomplish your goals, but with some effort.
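One way to sketch that backfill: list the objects in the bucket, then apply only the new function to objects the database has no record for. The object listing is a plain list here; in practice it would come from `google-cloud-storage`'s `bucket.list_blobs()`, and `records` would be rows in Cloud Datastore. Names are illustrative, not a definitive implementation:

```python
# Sketch: backfill a single NEW function over every object in a bucket.
# `records` is a set of (object_name, function_name) pairs, standing in
# for tracking rows in a database such as Cloud Datastore.


def backfill_new_function(object_names, function_name, handler, records):
    """Apply `handler` to each object that `function_name` hasn't seen yet."""
    touched = []
    for name in object_names:
        if (name, function_name) in records:
            continue  # already done; older functions' work is untouched
        handler(name)
        records.add((name, function_name))
        touched.append(name)
    return touched
```

Because the loop consults the records first, it is safe to re-run after a crash: already-processed objects are skipped.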

Keep in mind that Cloud Functions times out after running for 540 seconds. This means that if you have millions of files, you will need to implement an overlapping strategy for processing that many objects.
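One common shape for that strategy is resumable batches: each invocation processes a bounded chunk and persists a cursor, so the next invocation picks up where the last one stopped. A sketch, using a fixed batch size as a stand-in for a wall-clock budget (the cursor would be persisted in something like Datastore):

```python
# Sketch: resumable batch processing to stay inside the 540-second limit.
# A fixed batch size bounds each "invocation" instead of a wall clock;
# the returned cursor is persisted so the next run resumes from there.


def process_batch(object_names, cursor, batch_size, handler):
    """Process up to `batch_size` objects starting at `cursor`.

    Returns the new cursor, or None when everything is done.
    """
    end = min(cursor + batch_size, len(object_names))
    for name in object_names[cursor:end]:
        handler(name)
    return end if end < len(object_names) else None
```

Each run stays well under the timeout, and a `None` return signals that the backfill is complete.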

For cases where I need to process millions of objects, I usually launch App Engine Flexible or Compute Engine to complete the large task and then shut down once completed. The primary reason is the very high bandwidth to Google Cloud Storage and Datastore.

Upvotes: 3
