Myuuiii
Myuuiii

Reputation: 35

Entity Framework, 1 extra include drasticly changes response time

Situation

I'm currently working on a web API in ASP .NET Core. For this I make use of Entity Framework for database related operations. This API returns information about the game Azur Lane. One of my controllers is for retrieving information about "Ships". A ship is a pretty large object with a lot of stuff that needs to be included. Below you will see how I retrieve the set of ships with all the information linked to it. Recently I have added a new property to the so called "Ship" object called "ShipQuotes" which is a list of small objects (Id + 4 string properties). Of course I had to include this as well otherwise the API would return an empty list.

await _context.Ships
                    .Include(s => s.Stars)
                    .Include(s => s.Skins)
                    .Include(s => s.Skills)
                    .Include(s => s.LimitBreaks)
                    .Include(s => s.Gallery)
                    .Include(s => s.EquippableSlots)
                    .Include(s => s.Quotes)
                    .Include(s => s.BaseStats)
                    .Include(s => s.Level100Stats)
                    .Include(s => s.Level120Stats)
                    .Include(s => s.Level100RetrofitStats)
                    .Include(s => s.Level120RetrofitStats)
                    .Include(s => s.EnhanceValue)
                    .Include(s => s.ScrapValue)
                    .Include(s => s.Construction)
                    .Include(s => s.Construction.Availability)
                    .Include(s => s.Artist)
                    .Include(s => s.Pixiv)
                    .Include(s => s.Twitter)
                    .Include(s => s.Web)
                    .Include(s => s.VoiceActor)
                        .SingleAsync(ship => ship.Name == name);

The problem (?)

I'm not sure if this is a problem but I don't know why this might be happening but; after I added the ".Include(s => s.Quotes)" my API's response time dropped drastically. A response is on average 25kb so I don't think it has anything to do with my internet/network (the database is hosted on the same network as the web api)

an example response where the Quotes are also loadedtook about 30 seconds If we exclude the "Quotes" property from the object it would load in under 1 second

On average a ship has 30 quotes that look something like this:

{
      "id": "057404ac-ae41-494f-9825-f4fedbd61cd4",
      "skin": "Default Skin",
      "event": "Task",
      "audioUrl": "---------- EXTERNAL LINK REMOVED FOR STACKOVERFLOW POST ----------",
      "eN_Transcription": "We still have missions that haven't been completed. You should check on their progress.",
      "jP_Transcription": "ミッションがまだ終わっていないぞ。進捗を確認したほうがいい",
      "cN_Transcription": "还有未完成的任务哦,还是确认下进度比较好"
}

Here is the ship model (as seen on swagger)

Ship Model

{
  "shipId": "string",
  "name": "string",
  "rarity": "string",
  "stars": {
    "stars": "string",
    "count": 0
  },
  "nation": "string",
  "type": "string",
  "thumbnailImage": "string",
  "skins": [
    {
      "name": "string",
      "imageUrl": "string",
      "backgroundUrl": "string",
      "chibiUrl": "string",
      "live2dModel": true,
      "obtainedFrom": "string"
    }
  ],
  "skills": [
    {
      "iconUrl": "string",
      "name": "string",
      "description": "string",
      "color": "string"
    }
  ],
  "limitBreaks": [
    {
      "limitBreaks": [
        "string"
      ]
    }
  ],
  "gallery": [
    {
      "description": "string",
      "url": "string"
    }
  ],
  "equippableSlots": [
    {
      "maxEfficiency": 0,
      "minEfficiency": 0,
      "type": "string",
      "max": 0
    }
  ],
  "quotes": [
    {
      "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
      "skin": "string",
      "event": "string",
      "audioUrl": "string",
      "eN_Transcription": "string",
      "jP_Transcription": "string",
      "cN_Transcription": "string"
    }
  ],
  "baseStats": {
    "luck": 0,
    "armor": "string",
    "speed": 0,
    "health": 0,
    "firepower": 0,
    "antiAir": 0,
    "torpedo": 0,
    "evasion": 0,
    "aviation": 0,
    "oilConsumption": 0,
    "reload": 0,
    "antiSubmarine": 0,
    "oxygen": 0,
    "ammunition": 0,
    "accuracy": 0,
    "huntingRange": "string"
  },
  "level100Stats": {
    "luck": 0,
    "armor": "string",
    "speed": 0,
    "health": 0,
    "firepower": 0,
    "antiAir": 0,
    "torpedo": 0,
    "evasion": 0,
    "aviation": 0,
    "oilConsumption": 0,
    "reload": 0,
    "antiSubmarine": 0,
    "oxygen": 0,
    "ammunition": 0,
    "accuracy": 0,
    "huntingRange": "string"
  },
  "level100RetrofitStats": {
    "luck": 0,
    "armor": "string",
    "speed": 0,
    "health": 0,
    "firepower": 0,
    "antiAir": 0,
    "torpedo": 0,
    "evasion": 0,
    "aviation": 0,
    "oilConsumption": 0,
    "reload": 0,
    "antiSubmarine": 0,
    "oxygen": 0,
    "ammunition": 0,
    "accuracy": 0,
    "huntingRange": "string"
  },
  "level120Stats": {
    "luck": 0,
    "armor": "string",
    "speed": 0,
    "health": 0,
    "firepower": 0,
    "antiAir": 0,
    "torpedo": 0,
    "evasion": 0,
    "aviation": 0,
    "oilConsumption": 0,
    "reload": 0,
    "antiSubmarine": 0,
    "oxygen": 0,
    "ammunition": 0,
    "accuracy": 0,
    "huntingRange": "string"
  },
  "level120RetrofitStats": {
    "luck": 0,
    "armor": "string",
    "speed": 0,
    "health": 0,
    "firepower": 0,
    "antiAir": 0,
    "torpedo": 0,
    "evasion": 0,
    "aviation": 0,
    "oilConsumption": 0,
    "reload": 0,
    "antiSubmarine": 0,
    "oxygen": 0,
    "ammunition": 0,
    "accuracy": 0,
    "huntingRange": "string"
  },
  "enhanceValue": {
    "firepower": 0,
    "torpedo": 0,
    "aviation": 0,
    "reload": 0
  },
  "scrapValue": {
    "coins": 0,
    "oil": 0,
    "medals": 0
  },
  "construction": {
    "constructionTime": "string",
    "availability": {
      "light": "string",
      "heavy": "string",
      "special": "string",
      "limited": "string",
      "exchange": "string"
    }
  },
  "artist": {
    "name": "string",
    "url": "string"
  },
  "pixiv": {
    "name": "string",
    "url": "string"
  },
  "twitter": {
    "name": "string",
    "url": "string"
  },
  "web": {
    "name": "string",
    "url": "string"
  },
  "voiceActor": {
    "name": "string",
    "url": "string"
  }
}

My Question

What could be the cause of this drastic change? The contents of "Quotes" aren't that big but the database is showing an extreme amount of traffic

Below you can see a screenshot of my database traffic where I requested the object that I included in this post. Sudden bump in database traffic

relations?

db relations

Thank you ❤

[edit -> Last bit of post was unfinished]

Upvotes: 0

Views: 67

Answers (2)

You're essentially selecting from every table in your database and joining the results together EF Core makes doing so easy, but that doesn't mean you should be doing it.

Generally, a select that joins to more than 5 tables is going to cause performance problems, and is thus a symptom of a poorly-designed query and/or a badly-designed database. Looking at your DB schema, it's heavily normalised, possibly overly so - for example, the columns in the shipconstruction through shipweb tables (those on the right-hand side of your DB diagram) could all be pushed down into the ships table. That would make ships much wider in terms of columns and cause some data duplication, but would also eliminate a whopping 10 joins. Joins are the killer, as you've already discovered.

If you are only using a handful of columns from each of those tables, you can write a database view that selects only those columns. It will be more efficient than anything EF generates, but can be somewhat of a pain to maintain.

Finally, what you should look into over and above EF is caching. If the data in those various tables changes infrequently, cache the corresponding entities in-memory so you don't even have to go to the database. You can also quite easily write an abstraction for this that will return an entity from the cache if it exists there; or if not, fetches it from the database, puts it into the cache, and returns it. That way only your first lookup for an entity will be slow.

There is much more to optimising data storage performance than this, and many different ways of slicing and dicing things to balance your needs. Ultimately, the only right answer is what works best for your problem space, and you need to do research and testing to help you figure out that balance. But at least now you have some idea of some of the options.

Upvotes: 0

Panagiotis Kanavos
Panagiotis Kanavos

Reputation: 131722

Include eagerly loads related entities by generating a LEFT JOIN. This leads to row duplication though. For 30 quotes, the rest of the ship data will be repeated 30 times. This is always a problem with large or complex entities but even experienced developers forget about it because such complex entities aren't that common in simple applications. For moderately complex entities, it's possible to map database entities to API DTOs in the Select clause.

Unless you have an ERP, CRM or BOM problem, where all the important entities and queries are at least that complex, where you do want to load 10s of related entities each with dozens of instances for a single master entity.

EF Core 5 added split queries to handle this case. In the documentation example, the following query:

using (var context = new BloggingContext())
{
    var blogs = context.Blogs
        .Include(blog => blog.Posts)
        .ToList();
}

Generates

SELECT [b].[BlogId], [b].[OwnerId], [b].[Rating], [b].[Url], [p].[PostId], [p].[AuthorId], [p].[BlogId], [p].[Content], [p].[Rating], [p].[Title]
FROM [Blogs] AS [b]
LEFT JOIN [Post] AS [p] ON [b].[BlogId] = [p].[BlogId]
ORDER BY [b].[BlogId], [p].[PostId]

Adding AsSplitQuery() splits this to separate queries :

using (var context = new BloggingContext())
{
    var blogs = context.Blogs
        .Include(blog => blog.Posts)
        .AsSplitQuery()
        .ToList();
}

Generates

SELECT [b].[BlogId], [b].[OwnerId], [b].[Rating], [b].[Url]
FROM [Blogs] AS [b]
ORDER BY [b].[BlogId]

SELECT [p].[PostId], [p].[AuthorId], [p].[BlogId], [p].[Content], [p].[Rating], [p].[Title], [b].[BlogId]
FROM [Blogs] AS [b]
INNER JOIN [Post] AS [p] ON [b].[BlogId] = [p].[BlogId]
ORDER BY [b].[BlogId]

Upvotes: 2

Related Questions