Robert
Robert

Reputation: 3543

Entity Framework - slow query after adding group by

I have a following query which runs very fast:

var query =
    (from art in ctx.Articles

     join phot in ctx.ArticlePhotos on art.Id equals phot.ArticleId
     join artCat in ctx.ArticleCategories on art.Id equals artCat.ArticleId
     join cat in ctx.Categories on artCat.CategoryId equals cat.Id
     where art.Active && art.ArticleCategories.Any(c => c.Category.MaterializedPath.StartsWith(categoryPath))
     orderby art.PublishDate descending
     select new ArticleSmallResponse
     {
         Id = art.Id,
         Title = art.Title,
         Active = art.Active,
         PublishDate = art.PublishDate ?? art.CreateDate,
         MainImage = phot.RelativePath,
         RootCategory = art.Category.Name,
         Summary = art.Summary
     })
     .AsNoTracking().Take(request.Take);

However, if I add group by and change query to following statement, it runs much much slower.

var query =
    (from art in ctx.Articles

     join phot in ctx.ArticlePhotos on art.Id equals phot.ArticleId
     join artCat in ctx.ArticleCategories on art.Id equals artCat.ArticleId
     join cat in ctx.Categories on artCat.CategoryId equals cat.Id
     where art.Active && art.ArticleCategories.Any(c => c.Category.MaterializedPath.StartsWith(categoryPath))
     orderby art.PublishDate descending
     select new ArticleSmallResponse
     {
         Id = art.Id,
         Title = art.Title,
         Active = art.Active,
         PublishDate = art.PublishDate ?? art.CreateDate,
         MainImage = phot.RelativePath,
         RootCategory = art.Category.Name,
         Summary = art.Summary
     })
     .GroupBy(m => m.Id)
     .Select(m => m.FirstOrDefault())
     .AsNoTracking().Take(request.Take);

Homepage calls query 9 times for each category. With the first version of query, without caching turned on and connecting to SQL remotely, page load is around 1.5 seconds, which makes it almost instant when application is on server, but second way makes homepage load around 39 seconds when SQL is remotely.

Can it be fixed without rewriting the entire query in to the view or stored procedure?

Upvotes: 1

Views: 3028

Answers (2)

Max
Max

Reputation: 1548

It would be always advisable to add an Index to the GroupBy column. Also, add indexes for columns that are being used in the where clause.

Upvotes: 0

AngryHacker
AngryHacker

Reputation: 61636

Grouping is an expensive operation on the database end. Without knowing what your database looks like and what indexes you've setup, it will be difficult to determine. Why not just group on the client side after the data has arrived (assuming its not an overwhelming amount).

This question explains how. Group by in LINQ

Upvotes: 1

Related Questions