Paul Beusterien

Reputation: 29572

ajax, long polling and managing two minute retries

One of the AJAX requests to my node.js server can occasionally take more than two minutes. I've discovered that when the server takes longer than two minutes, the client resends the AJAX request. This causes the server to get even more bogged down, since it starts a second expensive process.

To get around that, I implemented a long-polling solution on the server. The client makes an AJAX call to a check function on the server, which just checks whether the process has completed, rechecks every five seconds, and responds to the client once it is done.
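
For reference, the check endpoint is roughly shaped like this (a minimal sketch; `isJobDone` is a hypothetical helper standing in for however the expensive process reports completion):

```javascript
// Held-open "check" handler: re-check every five seconds and only
// respond once the expensive process has finished.
function handleCheck(req, res, jobId) {
  (function poll() {
    if (isJobDone(jobId)) {           // hypothetical completion check
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ done: true }));
    } else {
      setTimeout(poll, 5000);         // recheck in five seconds
    }
  })();
}
```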

However, I still have a variation of the two-minute problem. A second check AJAX call still comes in after two minutes. Then both checks are running, and it seems that only the new one will communicate back to the client.

What is the best way to address this?

I'm using jQuery AJAX calls to a node.js server, from the Chrome browser.

UPDATE: From the node.js docs, "all incoming connections have a default timeout of 2 minutes". I'm still interested in suggestions about best practice for coding long-running server requests where the client doesn't need to know anything until the server finishes.
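
For what it's worth, that timeout can be raised or disabled on the server itself (a minimal sketch; `app` is just a placeholder for the existing request handler):

```javascript
var http = require('http');

// `app` stands in for the existing request handler / Express app.
var server = http.createServer(app);

// Remove the default 2-minute socket timeout on incoming connections
// (pass e.g. 10 * 60 * 1000 instead of 0 to allow 10 minutes).
server.setTimeout(0);

server.listen(3000);
```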

Upvotes: 3

Views: 2364

Answers (1)

Dobes Vandermeer

Reputation: 8810

To be clear about what is happening:

  1. Client sends request "do this thing for me ..."
  2. Node code starts an asynchronous operation (or several chained together)
  3. Two minutes pass
  4. HTTP request times out
  5. Client re-sends request
  6. Now there are two requests running
  7. And so on until there are a ton of requests running

I think I've heard this called "the dogpile effect". I don't know of any standard libraries in node.js or jQuery to help with it, although someone has started work on a caching proxy to help in the case where the response would be "cacheable":

https://github.com/simonw/dogproxy

So, you'd have to design your own system to get around this.

There are different ways to handle batch jobs and often it depends on the nature of the job.

What I've seen most commonly for tasks that take a long time is that the API returns an ID for the job right away, and then the client uses polling (or long polling) to wait for the job to complete. You'll need some kind of database to store the state (% complete) and result of the job and to let the client wait for that status to change. This can also allow you to "cancel" a job, although in that case the server-side code has to check periodically to see whether it has been cancelled.
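
As a rough sketch of that shape (assuming Express and an in-memory job store just to keep the example short; `runExpensiveTask` is a hypothetical stand-in for your actual work):

```javascript
var express = require('express');
var app = express();

var jobs = {};    // jobId -> { status: 'running' | 'done' | 'failed', result: ... }
var nextId = 1;

// Start the expensive work and return a job ID immediately.
app.post('/jobs', function (req, res) {
  var id = String(nextId++);
  jobs[id] = { status: 'running', result: null };

  runExpensiveTask(function (err, result) {   // hypothetical worker function
    jobs[id].status = err ? 'failed' : 'done';
    jobs[id].result = result;
  });

  res.json({ jobId: id });
});

// Cheap status check; the client polls this instead of holding one request open.
app.get('/jobs/:id', function (req, res) {
  var job = jobs[req.params.id];
  if (!job) {
    res.statusCode = 404;
    return res.end();
  }
  res.json(job);
});

app.listen(3000);
```

On the client, the jQuery side just polls the status URL until the job is finished, so no single request ever gets near the two-minute limit:

```javascript
function waitForJob(jobId, done) {
  (function poll() {
    $.getJSON('/jobs/' + jobId, function (job) {
      if (job.status === 'running') {
        setTimeout(poll, 5000);   // ask again in five seconds
      } else {
        done(job);
      }
    });
  })();
}
```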

Just be aware that wherever you store the job status has to be shared by all nodes in a cluster if the application is or will be scaled up later. You might use some kind of high-speed, lightweight storage system like memcached or redis for this.
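
A sketch of what that could look like with redis (using the node_redis client; the key layout and one-hour expiry are just assumptions for illustration):

```javascript
var redis = require('redis');
var client = redis.createClient();

// Write the job's status/result where every node in the cluster can see it,
// and let the key expire after an hour so finished jobs get cleaned up.
function setJobStatus(jobId, fields, callback) {
  client.hmset('job:' + jobId, fields, function (err) {
    if (err) return callback(err);
    client.expire('job:' + jobId, 3600, callback);
  });
}

// Read it back when a status-poll request arrives on any node.
function getJobStatus(jobId, callback) {
  client.hgetall('job:' + jobId, callback);
}
```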

Upvotes: 5
