Reputation: 5771
I'm having performance problem with Entity Framework while using generic methods for data aggregation. When querying max value of (indexed) Id column of table with few hundred thousands of rows using generic methods I'm getting huge performance drop. I'm using code-generated int keys instead of sql identities, this code is used to get next new Id.
Here's sample illustration. MaxTyped
uses Max
on db.Posts
(which is DbSet<Post>
), while MaxGeneric
uses generic method max
to do the same.
static int MaxTyped()
{
using (var db = new BloggingContext())
{
return db.Posts.Max(p => p.PostId);
}
}
static int MaxGeneric()
{
using (var db = new BloggingContext())
{
return max(db.Posts, p => p.PostId);
}
}
static int max<T>(DbSet<T> set, Func<T, int> func) where T : class
{
// intellisense says its IEnumerable.Max
return set.Max(func);
}
On my not so old home desktop with quite fast hdd, MaxTyped
on 100k rows runs in 0.5s, while MaxGeneric
around 6.5s. This is one order of magnitude slower.
On my office old testing server we're having times of few seconds vs almost 10 minutes.
The only trace of the problem I've found is the difference in Intellisense's output for Max method in both cases: in MaxTyped
it identifies method as IQueryable.Max
, while in max used by MaxGeneric
it says its IEnumerable.Max
, what might mean EF is doing Max on all loaded entities and not in database. I tried casting set
to IQueryable<T>
, but nothing changed. Why there's such a difference? How to fix it?
Upvotes: 0
Views: 132
Reputation: 111910
Because the MaxTyped
is executed "server side", so only a single row is returned, while MaxGeneric
is executed "client side", so all the rows of the table are returned, "deserialized" to int
and the "max(ed)" (this last step is the fastest one)
The real cost is in the passage of data between the SQL Server and the .NET application: less data has to be passed, the better it's.
Now try
static int max<T>(DbSet<T> set, Expression<Func<T, int>> func) where T : class
{
// intellisense says its IEnumerable.Max
return set.Max(func);
}
and see if set.Max
is using the IEnumerable<T>.Max()
or the IQueriable<T>.Max()
. If it's using the second one, its speed should be around 0.5 secs.
The difference is that IQueriable<T>.Max()
accepts only Expression<Func<T, int>>
, while IEnumerable<T>.Max()
accepts Func<T, int>
, and when you write p => p.PostId
this can be an Expression<Func<>>
or a Func<>
depending on the type of variable it's put in.
Upvotes: 3