Is really bad RDS performance expected when the database and web server are separated?

We're currently trying out RDS and are experiencing some really bad performance. I'm unable to figure out where the bottleneck is and would appreciate some guidance!

We have a simple web app that runs on an IIS server located in Denmark. Right now our database is also located on this IIS server. Our web app is rather old-fashioned, so before the server responds with a document it makes around 30 database queries to generate the initial HTML document.


With our current database setup (database located on the prod server), the document load times are as follows:

localhost: 7 seconds
Dev-server: 0.9 seconds
Prod-server: 0.5 seconds 

(Dev and prod servers are in the same data center - not AWS. All are using the database located on the prod server.)

With a t2.medium RDS instance running SQL Server Express, the load times are as follows:

localhost: 12.22 seconds
Dev-server: 24.76 seconds
Prod-server: 11.49 seconds

Clarification of server configuration: Our production database is running on our production server along with the application - a single monolithic instance doing everything. All my first tests used the production database. The servers and my localhost are located in Denmark, while the RDS instance is located in Frankfurt, Germany. So for the RDS test I was running the application logic on servers in Denmark while using an RDS database in Germany.


The RDS instance is in Frankfurt; I found that this data center had the smallest ping time for us.

I briefly tried an m4.large RDS instance to check whether it was simply a matter of using a poorly specced instance, but I saw exactly the same results.

The database isn't maxed out on anything: there seem to be plenty of CPU credits and the CPU is hovering around 1-4%. I've used SQL Server Profiler to check whether there was a problem with the queries (it could be something like indexing not working properly with the limited RAM, or just slow reads), but each individual query is quite fast - though they are quite far apart in time.

I've tried changing the storage type from General Purpose to Provisioned IOPS but found no change.

Can someone help me find the bottleneck? Or is this simply expected due to network latency? I was expecting a network penalty from moving the database further away from the application, but not this much. Is this setup even doable, or would it only work if we had the IIS server in the same availability zone?
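To put a number on the raw latency itself, something like the following can time a trivial query repeatedly from the web server against the RDS endpoint (a rough Python/pyodbc measurement sketch - the connection string is a placeholder):

    import time
    import pyodbc  # requires the Microsoft ODBC Driver for SQL Server

    # Placeholder connection string - point it at the RDS endpoint.
    CONN_STR = (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=my-instance.xxxxxxxxxx.eu-central-1.rds.amazonaws.com,1433;"
        "DATABASE=mydb;UID=myuser;PWD=mypassword"
    )

    conn = pyodbc.connect(CONN_STR)
    cursor = conn.cursor()

    samples = []
    for _ in range(30):                 # roughly the number of queries per page
        start = time.perf_counter()
        cursor.execute("SELECT 1")      # trivial query, so this mostly measures the round trip
        cursor.fetchone()
        samples.append(time.perf_counter() - start)

    print("avg round trip: %.1f ms" % (1000 * sum(samples) / len(samples)))
    print("30 sequential round trips: %.1f ms" % (1000 * sum(samples)))

Denmark to Frankfurt is typically on the order of 20-30 ms per round trip, so 30 strictly sequential queries should only add roughly 0.6-1 second; if the measured total is much higher, each "query" is probably costing several round trips (connection opens, row-by-row fetches and so on).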

Upvotes: 3

Views: 2085

Answers (2)

MisterSmith

Reputation: 3654

Based on your comments as well as your original question, I think your application does not lend itself well to a high-latency connection between the application and the database. Direct Connect might be an improvement, but it will always be a bottleneck. I would strongly advise against trying to run a production app with a remote database unless there is very strong motivation to do so. In an ideal world I'd advise you to look at parallelisation, caching and optimisation - but if you're stuck with something you just need to host, this is how I would do it.

CloudFront CDN, an Auto Scaling group with some EC2 instances, a load balancer with an ACM certificate, and a Multi-AZ RDS instance. All of these services will be local to each other, and I suspect your application will perform much better. This configuration offers geographically separate locations running copies of both IIS and SQL Server for fault tolerance. A basic rundown:

  • CloudFront CDN. This is where you point your DNS. Use cache headers from IIS to have static assets cached at edge locations, accelerating page loads and decreasing load on the backend IIS. Dynamic requests pass through CloudFront to the load balancer and are served back to the client transparently.
  • Auto Scaling group. Manages a set of instances split between availability zones and attached to a load balancer. Provides trigger-based/scheduled scaling of the number of instances and can respawn unhealthy instances. Instances can be pre-configured and saved as AMIs, or configured dynamically via the UserData script at startup.
  • Load balancer (ALB). A regional-level load balancer will handle failure of an AWS availability zone by routing traffic to the remaining reachable instances. You can configure SSL offloading at the load balancer, so the instances and the ALB communicate via HTTP but the public only talks to the load balancer over HTTPS. You can also set up your own SSL on the instances and have the ALB talk to the backend instances via HTTPS as well, but it's more work.
  • ACM certificate (Certificate Manager). It's a basic DNS-validated SSL cert. You can't download it and use it yourself - it's only available to CloudFront, load balancers etc. - but it's free! You can still upload your own certificate / intermediate certs / private key if they are converted to the correct format and attach it to a load balancer (a sketch of the import call follows this list).
  • Multi-AZ RDS. You get a single regional, fault-tolerant endpoint and AWS takes care of most of the details. Effectively you have two servers in different AZs, where the active instance replicates data to a separate standby instance. If the active instance becomes unavailable, the standby takes over transparently.
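To illustrate the certificate-import option mentioned in the ACM point, here is a minimal boto3 sketch (file names are placeholders; the certificate, chain and private key have to be PEM encoded, and the region assumes the ALB lives in Frankfurt):

    import boto3

    acm = boto3.client("acm", region_name="eu-central-1")

    # Placeholder file names - certificate, intermediate chain and private key in PEM format.
    with open("site.crt", "rb") as cert, open("chain.crt", "rb") as chain, open("site.key", "rb") as key:
        response = acm.import_certificate(
            Certificate=cert.read(),
            CertificateChain=chain.read(),
            PrivateKey=key.read(),
        )

    # The returned ARN is what gets attached to the ALB listener (or to CloudFront).
    print(response["CertificateArn"])

One caveat: a certificate used by CloudFront has to be requested or imported in us-east-1, while one attached to the regional load balancer lives in that load balancer's own region.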

In terms of putting this live, a few things to check. Make sure your SSL certificate covers the site name (www.example.com) and the bare/apex domain (example.com), and that IIS handles redirecting traffic to the appropriate site. Also make sure your DNS TTL has been reduced to a suitably small value to allow reasonably responsive changes (and rollback if there's a major issue). You need to allow the current TTL to expire before any change to the TTL comes fully into effect, so plan ahead (and possibly lower the TTL in steps if the current value is very high).
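To see what TTL your records are currently being handed out with, a quick check with the dnspython package (the domain names are placeholders):

    import dns.resolver  # pip install dnspython

    # Placeholder names - check both the apex and the www record.
    for name in ("example.com", "www.example.com"):
        answer = dns.resolver.resolve(name, "A")
        print(name, "TTL:", answer.rrset.ttl, "seconds")

Bear in mind a caching resolver reports a counted-down TTL; query your authoritative name server directly if you want the configured value.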

Upvotes: 1

S.N

Reputation: 5150

Your performance metrics (response times) got a little worse when you moved out to the cloud. This seems sensible, because moving to the cloud may require multiple network hops (depending on your AZ or region) compared with everything running within the same data center (the original DC). However, I am still not convinced by such a large difference. It is very difficult to quantify this difference without knowing more about your infrastructure setup within AWS.

Now, back to your question. In my view, the real lever here is not your original DC versus AWS as such; rather, it may have to do with the application itself. If your application makes 30 synchronous IO calls (step-after-step execution), then that is where you must fine-tune your code. If that is the case, you can implement asynchronous IO and compose the final result just before responding to the client. This approach (assuming you can parallelise your current logic) helps you build composable execution steps and thereby reduce the overall response time.
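As a rough illustration of the idea (a Python sketch with a thread pool standing in for whatever async mechanism your actual stack provides - the connection string and queries are placeholders), the independent queries can be issued concurrently and the results composed at the end:

    from concurrent.futures import ThreadPoolExecutor
    import pyodbc

    CONN_STR = "..."  # placeholder - same connection string the app already uses

    # Placeholder queries - in reality these are the ~30 queries the page issues.
    QUERIES = ["SELECT COUNT(*) FROM Orders", "SELECT TOP 1 Name FROM Customers"]

    def run_query(sql):
        # One connection per call keeps the sketch simple; a connection pool would be better in production.
        conn = pyodbc.connect(CONN_STR)
        try:
            return conn.cursor().execute(sql).fetchall()
        finally:
            conn.close()

    # Issue the queries concurrently so the network round trips overlap instead of adding up.
    with ThreadPoolExecutor(max_workers=10) as pool:
        results = list(pool.map(run_query, QUERIES))

    # Compose the final response from `results` here.

This only helps where the queries are independent of each other; anything that needs the result of a previous query still has to run sequentially, and then caching or batching becomes the better lever.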

Please bear in mind that high-end hardware doesn't necessarily improve performance unless your application is structured to make the best use of it.

Upvotes: 0
