Till

Reputation: 1138

Scaling workers for 10M+/day job queue

We'd like to upgrade our PHP-based AWS SQS queue worker architecture because we're handling 10M+ jobs per day and the infrastructure is getting expensive.

Our jobs use hardly any memory, but can run for 5-10 seconds each due to slow HTTP responses.

Could anyone recommend languages, approaches or tools that either support running dozens of workers on the same machine simultaneously, or can execute dozens of jobs simultaneously very efficiently?

Thanks a lot!

Upvotes: 0

Views: 201

Answers (2)

Alister Bulman

Reputation: 35169

I've run similar systems, with 30-200+ copies of PHP CLI-based workers across a number of machines. I started them with Supervisor (supervisord), whose 'numprocs' setting says how many copies of a particular program to start. You can have several such configuration groups.
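For illustration, a Supervisor program section along these lines starts a fixed number of identical workers. Only 'numprocs' comes from the setup described above; the program name, the script path, and the count of 50 are hypothetical placeholders.

```ini
; Hypothetical Supervisor section: program name, path and count are placeholders.
[program:sqs-worker]
command=php /srv/app/worker.php
numprocs=50
; With numprocs > 1, process_name must include %(process_num)s to stay unique.
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true
```

A second section with a different command and its own 'numprocs' would be one way to run another worker type on the same machine.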

To optimise for cost as well, consider 'spot instances', which reduce the price per hour, per machine. To get each machine working as quickly as possible after launch, make sure it is set up in advance with all the software and configuration it needs.

Upvotes: 1

Luc Hendriks

Reputation: 2533

If the bottleneck is in HTTP requests, you should consider using node.js. It makes it very easy to write code that runs asynchronously. I assume that in your current implementation each HTTP request blocks its worker until the response arrives. This is inefficient, because the CPU could do something else while waiting for the request to complete, and parse the results once they arrive. This is almost trivial with node.js and the excellent async library.
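As a rough sketch of the idea, the TypeScript below keeps many slow jobs in flight on a single thread with a cap on concurrency. It uses plain promises rather than the async library, and `fakeHttpJob`, `runWithLimit`, and the delays are hypothetical stand-ins for real HTTP calls, not anything from the original setup.

```typescript
// Sketch: many slow, I/O-bound jobs sharing one thread.
// fakeHttpJob is a hypothetical stand-in for a slow HTTP request.
function fakeHttpJob(id: number, delayMs: number): Promise<string> {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`job ${id} done`), delayMs);
  });
}

// Run all jobs with at most `limit` in flight at any moment.
async function runWithLimit<T>(
  jobs: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  const results: T[] = new Array(jobs.length);
  let next = 0;

  // Each lane pulls the next unstarted job until none are left.
  async function lane(): Promise<void> {
    while (next < jobs.length) {
      const index = next++;
      results[index] = await jobs[index]();
    }
  }

  const lanes = Array.from({ length: Math.min(limit, jobs.length) }, () => lane());
  await Promise.all(lanes);
  return results;
}

// 100 jobs of 50 ms each, 50 at a time: roughly 100 ms in total,
// versus roughly 5000 ms if they ran one after another.
async function demo(): Promise<number> {
  const jobs = Array.from({ length: 100 }, (_, i) => () => fakeHttpJob(i, 50));
  const start = Date.now();
  const results = await runWithLimit(jobs, 50);
  console.log(`${results.length} jobs in ${Date.now() - start} ms`);
  return results.length;
}

demo();
```

The cap matters in practice: without it, a large batch would open every connection at once, which the remote servers may not appreciate.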

An asynchronous implementation could speed your program up by a factor of 10-100, or maybe even more, especially if waiting for HTTP responses takes a lot more time than the actual computation. Use a fleet of micro or nano instances: node.js runs your JavaScript on a single thread, so normally you don't need multiple cores.
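To put a number on it, the figures from the question (10M jobs per day, 5-10 seconds each) can be run through Little's law. This is only a back-of-the-envelope sketch; `jobsInFlight` is a hypothetical helper, and it assumes the load is spread evenly over the day.

```typescript
// Rough sizing from the question's figures via Little's law:
// average jobs in flight = arrival rate * time each job takes.
// Assumes load is spread evenly over the day, which real traffic rarely is.
function jobsInFlight(jobsPerDay: number, secondsPerJob: number): number {
  const secondsPerDay = 24 * 60 * 60;
  return (jobsPerDay / secondsPerDay) * secondsPerJob;
}

const low = jobsInFlight(10_000_000, 5);   // about 579
const high = jobsInFlight(10_000_000, 10); // about 1157
console.log(`in flight: ${Math.round(low)} to ${Math.round(high)}`);
```

A blocking worker holds one job at a time, so that is roughly 580 to 1,160 worker processes on average, while a single asynchronous process can hold many of those waits at once.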

Another approach would be to attach the SNS service to SQS and set up a Lambda function that parses the ticket. See this page for an introduction to AWS Lambda. If you have peak days and quiet days, this approach should be more cost-effective. When the load is evenly distributed, AWS Lambda is more expensive than EC2.
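A minimal sketch of what the Lambda side might look like, assuming each job is published to SNS as a JSON message. The `Job` shape and `processJob` are hypothetical, and the event types are declared locally with only the fields the sketch reads.

```typescript
// Only the fields of the SNS-to-Lambda event that this sketch reads.
interface SnsRecord {
  Sns: { MessageId: string; Message: string };
}
interface SnsEvent {
  Records: SnsRecord[];
}

// Hypothetical job payload; the real shape depends on what is published.
interface Job {
  url: string;
}

// Hypothetical placeholder for the real work (the slow HTTP call).
async function processJob(job: Job): Promise<string> {
  return `processed ${job.url}`;
}

// Entry point: parse each message body as JSON and process it.
// In a deployed function this would be exported as the module's handler.
async function handler(event: SnsEvent): Promise<string[]> {
  const jobs = event.Records.map((record) => JSON.parse(record.Sns.Message) as Job);
  return Promise.all(jobs.map((job) => processJob(job)));
}
```

Since Lambda bills for the time a function runs, jobs that spend 5-10 seconds waiting on HTTP are paid for while idle, which is part of why steady load tends to favour EC2.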

Upvotes: 1
