Reputation: 341
If I use the following code, I get a list of students that study course 1 and course 2. (This is almost what I want.)
IQueryable<Student> filteredStudents = context.Students;
filteredStudents = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID).Contains(1));
filteredStudents = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID).Contains(2));
List<Student> studentList = filteredStudents.ToList<Student>();
However, if I try and do this in a foreach loop (as shown in the following code) then I get a list of all students that are signed up for the last course in the loop.
IQueryable<Student> filteredStudents = context.Students;
foreach (Course course in filter.Courses) {
if (course != null) {
filteredStudents = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID).Contains(course.CourseID));
}
}
List<Student> studentList = filteredStudents.ToList<Student>();
This behaviour confuses me. Can anyone explain why this is happening? And how to get around it? Thank you.
Upvotes: 3
Views: 446
Reputation: 415735
The problem is that the foreach loop only creates a single course
variable for all of the loop iterations, and this single variable is then captured into a closure. Also remember that the filters aren't actually executed until after the loop. Put those together, and by the time the filters execute this single course
variable has advanced to the last item in the Courses filter; you only check against that one last course.
I see four ways to fix the problem.
Create a new variable for each iteration of the loop (this is probably your best quick fix)
IQueryable<Student> filteredStudents = context.Students;
foreach (Course course in filter.Courses) {
if (course != null) {
int CourseID = course.CourseID;
filteredStudents = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID).Contains(CourseID));
}
}
List<Student> studentList = filteredStudents.ToList<Student>();
Resolve the IEnumerable expression inside the loop (probably much less efficient):
IEnumerable<Student> filteredStudents = context.Students;
foreach (Course course in filter.Courses) {
if (course != null) {
filteredStudents = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID).Contains(course.CourseID))
.ToList();
}
}
List<Student> studentList = filteredStudents.ToList<Student>();
Use more appropriate linq operators/lambda expressions to eliminate the foreach loop:
var studentList = context.Students.Where(s => s.Courses.Select(c => c.CourseID).Intersect(filter.Courses.Select(c => c.CourseID)).Any()).ToList();
Or in a more readable way:
IQueryable<Student> filteredStudents = context.Students;
var courses = filter.Courses.Select(c => c.CourseID);
var studentList = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID)
.Intersect(courses)
.Any()
).ToList();
If you play with this a bit, the performance should meet or far exceed the foreach loop through clever internal use of HashSets or — if you're very lucky — by sending a JOIN query to the DB). Just be careful, because it's easy to write something here that make numerous "extra" calls out to the DB inside that Intersect()
or Any()
method. Even so, this is the option I tend to prefer, with the exception that I probably wouldn't bother to call .ToList()
at the end.
This also illustrates why I don't have much use for ORMs like Entity Framework, linq-to-sql, or even NHibernate or ActiveRecord. If I'm just writing SQL, I can know I'm getting the correct join query. I could do that with an ORM, too, but now I still need to know about the specific SQL I'm creating, and I also have to know how to get the ORM to do what I want.
Use C# 5.0. This is fixed in the most recent version of C#, so that each iteration of a for/foreach loop is its own variable.
Upvotes: 4
Reputation: 4997
If you're trying to get every Student
that is enrolled in every course in filter.Courses
, you could try:
var courseIDs = filter.Courses.Select(c => c.CourseID);
var filteredStudents = context.Students
.Where(s => !courseIDs.Except(s.Courses.Select(c => c.CourseId)).Any())
which filters on whether courseIDs
is a subset of a Student
's course IDs.
Edit
Joel Coehoorn and Mikael Eliasson give a good explanation for why all of the students in the last course were retrieved.
Upvotes: 1
Reputation: 5227
I don't think this has to do with Entity Framework. It's a bug (not really, but a stupid design in c#) where the variable is declared outside the loop.
In this case it means that because the IEnumerable is lazily evaluated it will use the LAST value of the variable. Use a temp variable in the loop to solve it.
foreach (Course course in filter.Courses) {
if (course != null) {
var cId = course.CourseID;
filteredStudents = filteredStudents
.Where(s => s.Courses.Select(c => c.CourseID).Contains(cId))
.Select(s => s);
}
}
Or even better if you have the navigation properties properly defined. Just do:
var studentList = filter.Courses.SelectMany(c => c.Students).ToList()
See more info here: Is there a reason for C#'s reuse of the variable in a foreach?
Upvotes: 0
Reputation: 134
Because "filteredStudents = filteredStudents.Where..." is a straight assignment to the variable, each time through the loop you are completely replacing what was in it before. you need to append to it, not replace. Try a search for "c# AddRange"
Upvotes: 0