Reputation: 21
I am having a problem with a very simple query using EF. The following query takes over 30s to run and throws a timeout error:
var results = _Context.Entity
.Where(x => list1.Contains(x.Id))
.Where(x => !list2.Contains(x.Id))
.ToList();
While the one below runs in a 1/4 of a second:
var results = _Context.Entity
.Where(x => list1.Contains(x.Id))
.ToList();
results = results
.Where(x => !list2.Contains(x.Id))
.ToList();
The collection in the db is about 60k rows, and both my Lists are about 5k int. Is it a caching issue? I have many similar queries in my app that all run smoothly. What could I do to improve it?
Upvotes: 2
Views: 157
Reputation: 827
Your first one is checking running the sql which to where conditions.
SELECT * FROM Entity WHERE id IN () AND id NOT IN ()
Where as the second is only running the check where in query:
SELECT * FROM Entity WHERE id IN ()
The where check on the second statement of the second one:
results = results
.Where(x => !list2.Contains(x.Id))
.ToList();
Is actually getting ran on the webserver and not using sql. As soon as you type .ToList() you execute the SQL Linq.
Anyway the reason the first is slower than the second is because it's comparing every element in Entity to both lists. Where as the second checks to see if the id is in the list and then returns a list of items which meet that criteria. Then the program is using the returned filtered list (most likely fewer rows than the total some of your entity table) to check to make sure that list doesn't contain a row with the id from list 2 in it.
So the second just has fewer comparisons going on.
Upvotes: 0
Reputation: 34189
In case 1 you build a more complex query like this:
SELECT * FROM Entities WHERE Id in (...) AND Id NOT IN (...)
But in case 2 you make a simple query
SELECT * FROM Entities WHERE Id in (...)
and then filter out the results in C#. It would work even faster if your list2
was actually a HashSet
. However, you make SQL server return redundant rows which will be filtered out.
I guess, the first one works slower, because it doesn't utilize index, but I am not sure, I may be wrong.
What is more important, what you actually want is to get all items where ID is in list1
, but not in list2
. In this case, you don't need to make SQL server handle it - just filter out it in code and then pass only those IDs which you need:
var listToSearch = list1.Except(list2).ToArray();
var results = _Context.Entity.Where(x => listToSearch.Contains(x.Id)).ToList();
Upvotes: 1