AjLearning

Reputation: 397

Why does response time go up when the number of concurrent requests per second to my ASP.NET Core API goes up?

I'm testing an endpoint under load. At 1 request per second, the average response time is around 200 ms. The endpoint does a few DB lookups (all reads) that are pretty fast, and it's async throughout.

However, when doing a few hundred requests per second (req/sec), the average response time goes up to over a second.

I've had a look at the best practices guide at:

https://learn.microsoft.com/en-us/aspnet/core/performance/performance-best-practices?view=aspnetcore-2.2

Some suggestions like "Avoid blocking calls" and "Minimize large object allocations" seem like they don't apply since I'm already using async throughout and my response size for a single request is less than 50 KB.

There are a couple though that seem like they might be useful, for example:

https://learn.microsoft.com/en-us/ef/core/what-is-new/ef-core-2.0#high-performance

https://learn.microsoft.com/en-us/aspnet/core/performance/performance-best-practices?view=aspnetcore-2.2#pool-http-connections-with-httpclientfactory

Questions:

  1. Why would the average response time go up with an increased req/sec?
  2. Are the suggestions above that I've marked as 'might be useful' likely to help? I ask because while I would like to try them all, I have limited time available, so I'd like to start with the options that are most likely to help.
  3. Are there any other options worth considering?

I've had a look at these two existing threads, but neither answers my question:

Correlation between requests per second and response time?

ASP.NET Web API - Handle more requests per second

Upvotes: 2

Views: 2568

Answers (1)

Steve Py

Reputation: 34793

It will be hard to answer your specific issue without access to the code, but the main thing to consider is the size and complexity of the database queries being generated by EF. Using async/await will increase the responsiveness of your web server in accepting requests, but the request handling time under load will depend largely on the queries being run, because the database becomes the contention point. You will want to ensure that all queries are as minimal as possible. For example, there is a huge difference between the following three statements:

// 1. Loads the entire table (plus the Include'd rows) into memory,
//    then filters client-side -- the database returns everything.
var someData = context.SomeTable.Include(x => x.SomeOtherTable)
    .ToList()
    .Where(x => x.SomeCriteriaMethod())
    .ToList();

// 2. Filters in SQL, but still materializes complete entity rows
//    for both tables, whether or not all their columns are needed.
var someData = context.SomeTable.Include(x => x.SomeOtherTable)
    .Where(x => x.SomeField == someField && x.SomeOtherTable.SomeOtherField == someOtherField)
    .ToList();

// 3. Filters in SQL and projects only the columns the consumer needs;
//    no Include is required because the projection spans the join.
var someData = context.SomeTable
    .Where(x => x.SomeField == someField && x.SomeOtherTable.SomeOtherField == someOtherField)
    .Select(x => new SomeViewModel 
    {
       SomeTableId = x.SomeTableId,
       SomeField = x.SomeField,
       SomeOtherField = x.SomeOtherTable.SomeOtherField
    }).ToList();

Examples like the first are extremely inefficient because they load all rows, including the related table data, from the database before any filtering happens. Even though your web server may only pass back a few rows, it has requested everything from the database. These scenarios creep into apps when developers hit a filter condition that EF cannot translate to SQL (such as a call to an ordinary C# method) and "solve" it with a premature ToList() call, or as a by-product of poor separation, such as a repository pattern that returns IEnumerable.
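As a sketch of how that trap tends to look in practice (the entity, field, and method names here are hypothetical, and depending on the EF version an untranslatable method either throws or silently falls back to client evaluation):

// BAD: SomeCriteriaMethod() is plain C#, so EF cannot translate it to SQL.
// Materializing first with ToList() makes it "work", but pulls every row
// in the table across the wire before any filtering happens.
var bad = context.SomeTable
    .ToList()
    .Where(x => x.SomeCriteriaMethod())
    .ToList();

// BETTER: restate the criteria as an expression EF can translate, so the
// filter becomes a WHERE clause and only matching rows leave the database.
var good = context.SomeTable
    .Where(x => x.SomeField == someField && x.IsActive) // IsActive: assumed bool column
    .ToList();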

The second example is a little better: it avoids the read-everything ToList() call, but it still loads complete entity rows when much of that data isn't necessary, tying up resources on both the database and web servers.

The third example demonstrates refining the query down to the absolute minimum of data the consumer needs, which lets the database server make better use of indexes and execution plans.

Other performance pitfalls you can face under load are things like lazy loading. A database will only execute a finite number of concurrent requests. When there is no load, any extra lazy-load queries a request kicks off execute immediately, so you won't notice them; under load, though, they queue up alongside every other query and lazy-load request, and those queues drag down all data pulls.
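To make that concrete, here is a hedged sketch of the kind of code that silently fires those extra queries. It assumes lazy-loading proxies are enabled and illustrative Order/Customer entities; fromDate and toDate are assumed DateTime variables:

// 1 query: fetch the orders in the date range.
var orders = context.Orders
    .Where(o => o.OrderDate >= fromDate && o.OrderDate < toDate)
    .ToList();

foreach (var order in orders)
{
    // With lazy loading enabled, each access of an unloaded navigation
    // property fires its own "SELECT ... FROM [Customers] WHERE
    // [CustomerId] = @id" -- one extra round trip per order.
    Console.WriteLine(order.Customer.Name);
}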

Ultimately you should run a SQL profiler against your database to capture the kinds, and the number, of SQL queries being executed. When profiling in a test environment, pay close attention to the read count and CPU cost of each query rather than its total execution time. As a general rule of thumb, queries with higher read and CPU costs will be far more susceptible to execution-time blow-out under load: they require more resources to run and "touch" more rows, which means more waiting on row/table locks.
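If a server-side profiler isn't available, EF can at least surface the generated SQL itself. A minimal sketch, assuming EF Core 5+ where LogTo exists (older versions can wire in an ILoggerFactory via UseLoggerFactory instead); note this shows you the queries per request, but read counts and CPU cost still have to come from the database side:

using System;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;

public class AppDbContext : DbContext
{
    private readonly string _connectionString; // assumed to be supplied by DI/config

    public AppDbContext(string connectionString)
        => _connectionString = connectionString;

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(_connectionString)
            // Echo every generated SQL statement to the console --
            // a quick way to count queries per request in a test run.
            .LogTo(Console.WriteLine, LogLevel.Information);
}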

Another thing to watch out for is "heavy" queries in very large data systems, ones that need to touch a lot of rows, such as reports and, in some cases, highly customizable search queries. If these are needed, consider planning your database design around a read-only replica to run reports and large search expressions against, so that row-lock scenarios in your primary database cannot degrade responsiveness for the typical read and write queries.
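One common way to arrange that is a second DbContext registered against the replica's connection string, so report code opts into it explicitly. A sketch for Startup.ConfigureServices; the context and connection string names are illustrative:

// Normal traffic uses AppDbContext against the primary database;
// reports/searches use a separate context pointed at the replica.
services.AddDbContext<AppDbContext>(options =>
    options.UseSqlServer(Configuration.GetConnectionString("Primary")));

services.AddDbContext<ReportingDbContext>(options =>
    options.UseSqlServer(Configuration.GetConnectionString("ReadReplica"))
           // read-only work: skip change tracking entirely
           .UseQueryTrackingBehavior(QueryTrackingBehavior.NoTracking));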

Edit: Identifying lazy load queries.

These show up in a profiler as a query against a top-level table followed by a number of additional queries for its related tables.

For example, say you have a table called Order with related tables called Product, Customer, and Address (for the delivery address). To read all orders for a date range you'd expect to see a single query such as:

SELECT [OrderId], [Quantity], [OrderDate], [ProductId], [CustomerId], [DeliveryAddressId] FROM [dbo].[Orders] WHERE [OrderDate] >= '2019-01-01' AND [OrderDate] < '2020-01-01'

That is all you wanted: load the Orders and return them. But when the serializer iterates over the fields, it finds a Product, Customer, and Address referenced, and by attempting to read those fields it trips lazy loading, resulting in:

SELECT [CustomerId], [Name] FROM [dbo].[Customers] WHERE [CustomerId] = 22
SELECT [ProductId], [Name], [Price] FROM [dbo].[Products] WHERE [ProductId] = 1023
SELECT [AddressId], [StreetNumber], [City], [State], [PostCode] FROM [dbo].[Addresses] WHERE [AddressId] = 1211

If your original query returned 100 Orders, you could potentially see 100 of the above sets of queries: each order row's lazy loads look up its related Customer by Customer ID, its Product by Product ID, and its Address by Delivery Address ID. This can, and does, get costly. It may not be noticeable when run against a test environment, but it adds up to a lot of queries.

If you eager-load the related entities with .Include(), EF will compose JOIN statements to fetch the related rows all in one hit, which is considerably faster than fetching each related entity individually. Still, that can pull back a lot of data you don't need. The best way to avoid the extra cost is to leverage projection through Select to retrieve just the columns you need.
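Applied to the Orders example above, a sketch of that projection (OrderSummary is an illustrative view model, and the navigation/property names are assumed):

var orders = context.Orders
    .Where(o => o.OrderDate >= fromDate && o.OrderDate < toDate)
    .Select(o => new OrderSummary
    {
        OrderId      = o.OrderId,
        OrderDate    = o.OrderDate,
        Quantity     = o.Quantity,
        ProductName  = o.Product.Name,
        CustomerName = o.Customer.Name,
        City         = o.DeliveryAddress.City
    })
    .ToList(); // one SELECT with JOINs, returning only these columns

Because no entity graph is materialized, there is nothing for the serializer to lazy-load, so the 1 + 3N query pattern above collapses to a single statement.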

Upvotes: 4
