Reputation: 347
We are starting a new project at our company where we run a few Python scripts for each client, twice a day. The idea is that twice a day a Cloud Function is triggered, and that function starts the Python script for each client by creating new instances of App Engine, Cloud Run, or any other serverless service Google offers.
At the beginning we thought of using Cloud Functions, but we quickly found out that they are not suited to long-running Python scripts. The scripts will eventually calculate and collect different information for each client and write it to Firebase.
The flow would be: Cloud Function triggered -> function triggers a GCP instance for each client -> script runs for each client -> output is saved to Firebase.
What would be the recommended way to do this without a dedicated server? Which GCP serverless services would fit best?
Upvotes: 3
Views: 2568
Reputation: 6298
You can execute "long" running Google App Engine (GAE) Tasks using Cloud Tasks.
How long (which is why I have it in quotes) depends on the kind of scaling that you are using for your GAE Project Instance. Instances set to 'automatic scaling' are limited to a maximum of 10 minutes, while instances set to 'manual' or 'basic' scaling have up to 24 hours of execution time.
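As a rough sketch of the fan-out described above, the code below enqueues one Cloud Tasks task per client, each targeting an App Engine handler. The handler path `/run-client`, the payload shape, and the function names are assumptions made for illustration, not something from the question. `enqueue_all` needs the `google-cloud-tasks` package and valid GCP credentials; `build_task` is plain Python.

```python
import json


def build_task(client_id: str, relative_uri: str = "/run-client") -> dict:
    """Build a task dict for one client (handler path is hypothetical)."""
    body = json.dumps({"client_id": client_id}).encode("utf-8")
    return {
        "app_engine_http_request": {
            "relative_uri": relative_uri,
            "headers": {"Content-Type": "application/json"},
            "body": body,
        }
    }


def enqueue_all(client_ids, project: str, location: str, queue: str) -> None:
    """Create one task per client in an existing Cloud Tasks queue."""
    # Imported here so the payload builder above works without the package.
    from google.cloud import tasks_v2

    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path(project, location, queue)
    for client_id in client_ids:
        task = build_task(client_id)
        task["app_engine_http_request"]["http_method"] = tasks_v2.HttpMethod.POST
        client.create_task(parent=parent, task=task)
```

Something like `enqueue_all(["acme", "globex"], "my-project", "europe-west1", "client-jobs")` (all placeholder values) could then be called from the scheduled trigger twice a day.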
From the earlier link
....all workers must send an HTTP response code (200-299) to the Cloud Tasks service, in this instance before a deadline based on the instance scaling type of the service: 10 minutes for automatic scaling or up to 24 hours for manual scaling. If a different response is sent, or no response, the task is retried....
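To illustrate the quoted rule, here is a minimal sketch of the worker side: it returns a 2xx status only when the per-client script finished, so any failure leads Cloud Tasks to retry. `run_client_script` and the JSON payload shape are hypothetical stand-ins; wiring `handle_task` into a web framework route is left out.

```python
import json


def run_client_script(client_id: str) -> None:
    """Hypothetical placeholder for the real per-client script."""
    if not client_id:
        raise ValueError("client_id is required")


def handle_task(body: bytes) -> int:
    """Return the HTTP status code the worker should respond with."""
    try:
        payload = json.loads(body)
        run_client_script(payload["client_id"])
    except Exception:
        # A response outside 200-299 causes the task to be retried.
        return 500
    return 200
```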
Update (there seems to be some confusion between 30 minutes and 24 hours):
Standard HTTP requests have a maximum execution time of 30 minutes (source), while GAE endpoints can run for up to 24 hours if you're using manual scaling (source).
Upvotes: 1
Reputation: 75745
There are a lot of great answers! The key here is to decouple and to distribute the processing.
For the decoupling, you can use Cloud Tasks (where you can add flow control with rate limits, or postpone a task to a later time) or Pub/Sub (a simpler message queueing solution).
Cloud Run is what you need to run processing of up to 15 minutes, but you will have to fine-tune it (see my tips below).
So, to summarize the process
However, if your processing is compute-intensive, you have to tune Cloud Run. If the processing takes 15 minutes for 1 client on 1 vCPU, you can't process more than 1 client per CPU without risking the timeout (2 clients on the same CPU can take about 30 minutes in total, so you can hit the timeout). For that reason, I recommend setting the concurrency parameter of Cloud Run to 1, so that each instance processes only one request at a time. (Of course, if you give Cloud Run 2 or 4 CPUs, you can also raise the concurrency to 2 or 4 to allow parallel processing on the same instance, but on different CPUs.)
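The arithmetic above can be sketched as follows. This assumes a deliberately simple model in which CPU-bound requests share the available vCPUs evenly, so durations scale linearly once there are more concurrent requests than CPUs; real workloads will deviate from this, and the function names are made up for the example.

```python
import math


def estimated_minutes(per_client_minutes: float, concurrency: int, vcpus: int) -> float:
    """Estimated wall-clock time per request under the linear sharing model."""
    return per_client_minutes * max(1.0, concurrency / vcpus)


def max_safe_concurrency(per_client_minutes: float, timeout_minutes: float, vcpus: int) -> int:
    """Largest concurrency whose estimated duration stays within the timeout."""
    return math.floor(vcpus * timeout_minutes / per_client_minutes)


# 2 clients on 1 vCPU, 15 minutes each: about 30 minutes, past a 15-minute timeout.
print(estimated_minutes(15, 2, 1))
# With a 15-minute timeout: concurrency 1 on 1 vCPU, 4 on 4 vCPUs.
print(max_safe_concurrency(15, 15, 1), max_safe_concurrency(15, 15, 4))
```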
If the processing is not CPU-intensive (you make API calls and wait for the answers), it's harder to say. Try a concurrency of 5, 10, 30, ... and observe the behaviour and latency of the processed requests. No worries: with Cloud Tasks and Pub/Sub you can set retry policies in case of timeout.
One last thing: is your processing idempotent? I mean, if you run the same process twice for the same client, is the result still correct, or is it a problem? Try to make the solution idempotent to cope with retries and, more generally, with the issues that can happen in distributed computing (including replays).
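One common way to get idempotent writes is to derive the storage key from the inputs of the run, so that a retry overwrites the same record instead of adding a duplicate. The sketch below uses a plain dict as a stand-in for the database; the key format and the "morning"/"evening" slot names are assumptions for illustration. With Firestore, the same idea would mean writing to a document with a deterministic ID rather than letting the database generate one.

```python
from datetime import date


def run_key(client_id: str, run_date: date, slot: str) -> str:
    """Deterministic key: same client, day and slot always map to the same record."""
    return f"{client_id}_{run_date.isoformat()}_{slot}"


def save_result(store: dict, client_id: str, run_date: date, slot: str, result: dict) -> None:
    """Overwrite (not append) the record for this run, so replays are harmless."""
    store[run_key(client_id, run_date, slot)] = result


store = {}
today = date(2021, 6, 1)
save_result(store, "acme", today, "morning", {"total": 42})
# A retry of the same task writes to the same key.
save_result(store, "acme", today, "morning", {"total": 42})
```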
Upvotes: 1
Reputation: 2055
@NoCommandLine's answer is a good recommendation. Cloud Run is also a good option if you need longer-running operations, since its request timeout can be set anywhere from the default of 5 minutes up to 60 minutes. You can set or update the request timeout through the Cloud Console, the command line, or YAML.
Meanwhile, the execution time for a Cloud Function is only 1 minute by default and can be raised to a maximum of 9 minutes.
You can check out the full documentation below:
You can also check a related SO question through this link.
Upvotes: 1