Reputation: 28
The query below does what I want, but it is very slow when the two lists have many items (more than 300,000 each).
Basically, it returns every person in list 2 who has at least one document that no person in list 1 has.
personList1.Add(person1);
personList1.Add(person2);
personList2.Add(person2);
personList2.Add(person3);
var result = personList2
    .Where(p2 => p2.documents
        .Exists(d2 => !personList1
            .Exists(p1 => p1.documents
                .Contains(d2)
            )
        )
    ).ToList();
result.ForEach(r => Console.WriteLine(r.name));
//Should return person3 name
Classes
public class Person
{
    public string name { get; set; }
    public List<IdentificationDocument> documents { get; set; }
    public Person()
    {
        documents = new List<IdentificationDocument>();
    }
}
public class IdentificationDocument
{
    public string number { get; set; }
}
Full code
https://dotnetfiddle.net/gS57gV
Does anyone know how to improve the query's performance? Thank you!
Upvotes: 0
Views: 795
Reputation: 77304
Put all relevant data in a structure made for lookup first:
var lookup = new HashSet<string>(personList1.SelectMany(p => p.documents).Select(d => d.number));
var result = personList2.Where(p => !p.documents.Select(d => d.number).Any(lookup.Contains));
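For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. It collects every document number from list 1 into a HashSet once, so each check against list 2 is a hash lookup instead of a scan over all of list 1. A few things in it are my assumptions, not something from the question: the helper name WithUnknownDocument is made up, the sample people and document numbers are invented, and documents are compared by their number string (the question's Contains(d2) compares object references, because IdentificationDocument does not override Equals). The filter keeps the question's rule: a person is returned when at least one of their documents is missing from list 1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class IdentificationDocument
{
    public string number { get; set; }
}

public class Person
{
    public string name { get; set; }
    public List<IdentificationDocument> documents { get; set; } = new List<IdentificationDocument>();
}

public static class PersonFilter
{
    // Hypothetical helper (name is an assumption): returns the people in list2
    // who have at least one document whose number does not occur in list1.
    public static List<Person> WithUnknownDocument(List<Person> list1, List<Person> list2)
    {
        // Build the lookup once: every document number that appears in list1.
        var known = new HashSet<string>(
            list1.SelectMany(p => p.documents).Select(d => d.number));

        // Each Contains call is a hash lookup, so the total work grows with the
        // number of documents rather than with list1.Count * list2.Count.
        return list2
            .Where(p => p.documents.Any(d => !known.Contains(d.number)))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Invented sample data, for illustration only.
        var person1 = new Person { name = "person1", documents = { new IdentificationDocument { number = "A" } } };
        var person2 = new Person { name = "person2", documents = { new IdentificationDocument { number = "B" } } };
        var person3 = new Person { name = "person3", documents = { new IdentificationDocument { number = "C" } } };

        var personList1 = new List<Person> { person1, person2 };
        var personList2 = new List<Person> { person2, person3 };

        foreach (var p in PersonFilter.WithUnknownDocument(personList1, personList2))
        {
            Console.WriteLine(p.name); // prints "person3"
        }
    }
}
```

Note that a person with no documents at all is never returned by this version, which matches the Exists-based query in the question. If you would rather return people for whom none of their documents appear in list 1, swap the predicate for !p.documents.Any(d => known.Contains(d.number)).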
Upvotes: 2