I'm building an asynchronous RESTful web service and I'm trying to figure out the most scalable, highest-performing solution. Originally, I planned to use the FriendFeed configuration: one machine running nginx to host static content, act as a load balancer, and act as a reverse proxy to four machines running the Tornado web server for dynamic content. The recommendation is to run nginx on a quad-core machine and each Tornado server on a single-core machine. Amazon Web Services (AWS) seems to be the most economical and flexible hosting provider, so here are my questions (a minimal sketch of one Tornado backend follows them):
1a.) On AWS, I can only find c1.medium (dual-core CPU, 1.7 GB memory) instance types. So does this mean I should have one nginx instance running on a c1.medium and two Tornado servers on m1.small (single-core CPU, 1.7 GB memory) instances?
1b.) If I needed to scale up, how would I chain these three instances to another three instances in the same configuration?
2a.) It makes more sense to host static content in an S3 bucket. Would nginx still be hosting these files?
2b.) If not, would performance suffer from not having nginx host them?
2c.) If nginx won't be hosting the static content, it's really only acting as a load balancer. There's a great paper here that compares the performance of different cloud configurations, and says this about load balancers: "Both HaProxy and Nginx forward traffic at layer 7, so they are less scalable because of SSL termination and SSL renegotiation. In comparison, Rock forwards traffic at layer 4 without the SSL processing overhead." Would you recommend replacing nginx as a load balancer with one that operates at layer 4, or is Amazon's Elastic Load Balancer sufficiently high-performing?
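For reference, here is a minimal sketch of what I have in mind for each Tornado backend; the /ping handler and the port are placeholders, not part of the actual service:

```python
# Minimal placeholder for one Tornado backend; the handler and port
# are illustrative only.
import tornado.ioloop
import tornado.web


class PingHandler(tornado.web.RequestHandler):
    def get(self):
        self.write({"status": "ok"})  # Tornado serializes dicts as JSON


if __name__ == "__main__":
    app = tornado.web.Application([(r"/ping", PingHandler)])
    app.listen(8001)  # each backend instance would listen on its own port
    tornado.ioloop.IOLoop.current().start()
```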
Upvotes: 6
Views: 4011
Reputation: 8250
1a) Nginx is an asynchronous, event-based server; even a single worker can handle a large number of simultaneous connections (max_clients = worker_processes * worker_connections / 4 in a reverse-proxy setup, per the nginx documentation) and still perform well. I have myself tested around 20K simultaneous connections on a c1.medium-class box (not on AWS). Here you would set worker_processes to two (one per CPU) and run four Tornado backends (you can test with more to see where it breaks). Only if this setup runs into problems should you add another similar setup and chain the two behind an Elastic Load Balancer.
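A quick back-of-the-envelope check of that formula; worker_connections = 1024 is nginx's default and an assumption here:

```python
# Rough capacity estimate from the nginx formula above. The divisor 4 is
# commonly explained as: browsers open up to 2 connections per client, and
# as a reverse proxy nginx opens one backend connection for each of them.
worker_processes = 2       # one per core on the c1.medium
worker_connections = 1024  # nginx default; raise it to serve more clients

max_clients = worker_processes * worker_connections // 4
print(max_clients)  # 512
```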
1b) As said in (1a), use an Elastic Load Balancer. Somebody tested an ELB at 20K requests/second, and that is not its limit; the test was stopped only because the testers lost interest, not because the ELB hit a ceiling.
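As a sketch of the chaining step, registering additional backend instances with a classic ELB might look like this using boto3; the load balancer name and instance IDs are placeholders:

```python
# Hypothetical example: attach new backend instances to an existing
# classic Elastic Load Balancer. Name and instance IDs are placeholders.
import boto3

elb = boto3.client("elb")

elb.register_instances_with_load_balancer(
    LoadBalancerName="my-tornado-elb",
    Instances=[{"InstanceId": "i-0123456789abcdef0"},
               {"InstanceId": "i-0fedcba9876543210"}],
)
```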
2a) Host static content on CloudFront; it's a CDN and is meant for exactly this (cheaper and faster than S3, and it can pull content from an S3 bucket or from your own server). It's highly scalable.
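Once the files live on CloudFront, the app only needs to emit CDN URLs instead of serving the bytes itself. A trivial sketch; the distribution domain and helper name are placeholders:

```python
# Hypothetical helper: rewrite static asset paths to a CloudFront
# distribution. The domain below is a placeholder.
CDN_BASE = "https://d1234example.cloudfront.net"

def cdn_url(path: str) -> str:
    """Return the CloudFront URL for a static asset path."""
    return f"{CDN_BASE}/{path.lstrip('/')}"

print(cdn_url("/css/site.css"))
# https://d1234example.cloudfront.net/css/site.css
```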
2b) Obviously, with nginx serving static files it has to handle more requests from the same number of users. Taking that load away reduces the work of accepting connections and sending files across (and lowers bandwidth usage).
2c) Avoiding nginx altogether looks like a good solution (one less middleman). An Elastic Load Balancer will handle SSL termination and reduce the SSL load on your backend servers (which improves backend performance). The experiments above showed around 20K requests/second, and since it's elastic it should stretch further than a software load balancer (see this nice document on how it works).
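One practical detail if the ELB terminates SSL: start Tornado's HTTPServer with xheaders=True so it trusts the X-Forwarded-For / X-Forwarded-Proto headers the balancer sets. A minimal sketch; the handler and port are placeholders:

```python
# Sketch: Tornado backend behind an SSL-terminating load balancer.
# With xheaders=True, Tornado reads X-Forwarded-For / X-Forwarded-Proto,
# so request.remote_ip and request.protocol reflect the real client.
import tornado.httpserver
import tornado.ioloop
import tornado.web


class WhoHandler(tornado.web.RequestHandler):
    def get(self):
        self.write({"client_ip": self.request.remote_ip,
                    "scheme": self.request.protocol})


app = tornado.web.Application([(r"/who", WhoHandler)])
tornado.httpserver.HTTPServer(app, xheaders=True).listen(8001)
tornado.ioloop.IOLoop.current().start()
```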
Upvotes: 2