Pievis

Reputation: 1994

App engine aborted request error during tasks burst

I have a question about a task failure message I get on App Engine when handling a large number of tasks.

The error is the following:

Request was aborted after waiting too long to attempt to service your request.

And my service is configured as below:

<threadsafe>false</threadsafe>
<runtime>java8</runtime>

<system-properties>
    <property name="appengine.api.urlfetch.defaultDeadline" value="${urlfetch.deadline.override}"/>
    <property name="java.util.logging.config.file" value="WEB-INF/logging.properties"/>
</system-properties>

<instance-class>F2</instance-class>
<automatic-scaling></automatic-scaling>

In my code I want to run an operation concurrently, so I launch many tasks at once. The problem is that from time to time I get the error shown above, because there are not enough instances ready to handle the calls, and a request that waits in the queue longer than the maximum allowed time is simply aborted.

Do you have any advice on how to handle this situation? Could setting a high value for min-pending-latency have a positive effect here?

Thank you for your help :)

Upvotes: 1

Views: 551

Answers (1)

Dan Cornilescu

Reputation: 39824

Java apps tend to have a longer instance startup time, which probably plays a big role in your scenario.

Things to consider:

  • check your queue configuration, make sure it's not the one causing the bottleneck in task processing
  • enable multithreading (with <threadsafe>true</threadsafe> in your config), if your application can tolerate that (not always possible). Or, if just this particular task handler is/can be made threadsafe, maybe pull it in a separate service and make that multithreaded. This will allow one instance to handle multiple tasks at the same time, reducing the number of instances needed and thus lowering the instance startup time impact
  • enable/increase the number of standby/resident/idle instances (using the min-idle-instances scaling configuration element) - these instances are designed to handle temporary request peaks until GAE spins up new dynamic instances to handle the traffic increase (which takes some time, including the instance startup time), see also "Why do more requests go to new (dynamic) instances than to resident instance?"
  • stagger your tasks in time to avoid overly abrupt task peaks, using TaskOptions.countdownMillis(long)/TaskOptions.etaMillis(long) or by using DeferredTasks instead of a worker service; this reduces the effective time those tasks are considered to have spent in the queue, giving GAE a chance to start enough instances to handle them
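
To illustrate the staggering idea from the last bullet, here is a minimal sketch in plain Java that only computes the delays. StaggerPlan, countdowns, batchSize and intervalMillis are hypothetical names made up for this example, not part of the App Engine SDK. The assumption is that each computed value would be handed to TaskOptions.countdownMillis(long) when the corresponding task is enqueued; suitable batch sizes and intervals depend on your instance startup time and are not something this sketch can tell you.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an App Engine class): spreads a burst of tasks
// over time so that at most batchSize tasks become eligible per interval.
final class StaggerPlan {

    // Returns one countdown in milliseconds per task, in enqueue order.
    static List<Long> countdowns(int taskCount, int batchSize, long intervalMillis) {
        if (taskCount < 0 || batchSize <= 0 || intervalMillis < 0) {
            throw new IllegalArgumentException("invalid stagger parameters");
        }
        List<Long> delays = new ArrayList<>(taskCount);
        for (int i = 0; i < taskCount; i++) {
            // Tasks 0..batchSize-1 run immediately, the next batch one interval later, etc.
            delays.add((i / batchSize) * intervalMillis);
        }
        return delays;
    }

    public static void main(String[] args) {
        // Assumed usage: pass each value to TaskOptions.countdownMillis(long)
        // when building the task for the queue.
        for (long delay : countdowns(7, 3, 2000L)) {
            System.out.println(delay);
        }
    }
}
```

With 7 tasks, a batch size of 3 and a 2000 ms interval, the delays come out as 0, 0, 0, 2000, 2000, 2000, 4000, so the queue releases three tasks every two seconds instead of all seven at once.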

Upvotes: 1
