Reputation: 5299
I have a list with ids:
var myList = new List<int>();
I want to select all objects from db with ids from myList:
var objList= myContext.MyObjects.Where(t => myList.Contains(t.Id)).ToList();
But when myList.Count > 8000
i get an error:
The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information.
I think that it's because i used Contains()
. What can I use instead of Contains?
Upvotes: 16
Views: 3483
Reputation: 1116
If your list of IDs come from db, then, instead of using List< int > keep it IQueryable< int > because this will not form sql query with IN clause having thousands of arguments. The IN clause will now have subquery. I am surprised to see nobody has mentioned it yet.
IQueryable< int > myList = myContext.Obj1.Where(...).Select(x => x.Id);
var objList = myContext.MyObjects.Where(t => myList.Contains(t.Id)).ToList();
If not this, then, I'll suggest to create an extension method to process this in chunks. You can define your own chunk size.
Something like this:
public static class QueryableExtensions
{
public static List<T1> WhereContains<T1, T2>(this IQueryable<T1> set, List<T2> values, string property)
{
int chunkSize = 5000;
int currentChunk = 1;
List<T1> results = new List<T1>();
int valuesLeft = values.Count;
while (valuesLeft > 0)
{
List<T2> currentValues = values.Skip((currentChunk - 1) * chunkSize).Take(chunkSize).ToList();
results.AddRange(set.Where($"@0.Contains(outerIt.{property})", new object[] { currentValues }).ToList());
valuesLeft -= chunkSize;
currentChunk++;
}
return results;
}
}
Hope it helps!
Upvotes: 0
Reputation: 183
Why not try
var objList= from obj in myContext.MyObjects
join myId in myList on obj.Id equals myId
select obj;
Upvotes: 0
Reputation: 35746
If you wan't this to perform well I'd suggest you use table valued parameters and a stored procedure.
in your database, using TSQL,
CREATE TYPE [dbo].[IdSet] AS TABLE
(
[Id] INT
);
GO
CREATE PROCEDURE [dbo].[Get<table>]
@ids [dbo].[IdSet] READONLY
AS
SET NOCOUNT ON;
SELECT
<Column List>
FROM
[dbo].[<table>] [T]
WHERE
[T].[Id] IN (SELECT [Id] FROM @ids);
RETURN 0;
GO
Then, in C#
var ids = new DataTable()
ids.Columns.Add("Id", typeof(int));
foreach (var id in myList)
{
ids.Rows.Add(id);
}
var objList = myContext.SqlQuery<<entity>>(
"[dbo].[Get<table>] @ids",
new SqlParameter("@ids", SqDbType.Structured)
{
Value = ids,
TypeName = "[dbo].[IdSet]"
}));
Upvotes: 8
Reputation: 106936
You can perform the query on the client side by adding AsEnumerable()
to "hide" the Where
clause from Entity Framework:
var objList = myContext
.MyObjects
.AsEnumerable()
.Where(t => myList.Contains(t.Id))
.ToList();
To improve performance you can replace the list with a HashSet
:
var myHashSet = new HashSet<int>(myList);
and then modify the predicate in Where
accordingly:
.Where(t => myHashSet.Contains(t.Id))
This is the "easy" solution in terms of time to implement. However, because the query is running client side you may get poor performance because all MyObjects
rows are pulled to the client side before they are filtered.
The reason you get the error is because Entity Framework converts you query into something like this:
SELECT ...
FROM ...
WHERE column IN (ID1, ID2, ... , ID8000)
So bascially all 8000 ID's from the list is included in the generated SQL which exceeds the limit of what SQL Server can handle.
What Entity Framework "looks for" to generate this SQL is ICollection<T>
which is implemented by both List<T>
and HashSet<T>
so if you try to keep the query on the server side you get no improved performance by using HashSet<T>
. However, on the client side the story is different where Contains
is O(1)
for HashSet<T>
and O(N)
for List<T>
.
Upvotes: 14
Reputation: 57302
You could split the list in several sub-lists, and run separate queries:
int start = 0;
int count = 0;
const int chunk_size = 1000;
do {
count = Math.Min(chunk_size, myList.Count - start);
var tmpList = myList.GetRange(start, count);
// run query with tmpList
var objList= myContext.MyObjects.Where(t => tmpList.Contains(t.Id)).ToList();
// do something with results...
start += count;
} while (start < myList.Count);
Of course, you need to find out the good "chunk size" in some way that works for you. Depending on the size of the table and of the list, it might be more convenient to load the entire table and filter in the code, as suggested in other answers.
Upvotes: 5
Reputation: 9782
You could create a temporary database table which represents myList
and refactor your query to a JOIN
with that temporary List.
The reason for the error is that the actual query produced contains all elements of myList
.
Basically the DB (the query processor) needs to see both lists to do the filtering. If the second list is too large to fit inside the query you have to provide it otherwise (for example as a temp table)
Upvotes: 5