Rafael Oliveira

Reputation: 28

How to improve the performance of a LINQ query that compares two lists?

The query below does what I want, but it is very slow when both lists are large (more than 300,000 items each).

Basically, it returns every person in list 2 who has at least one document that does not appear in list 1.

        personList1.Add(person1);
        personList1.Add(person2);

        personList2.Add(person2);
        personList2.Add(person3);

        var result = personList2
                    .Where(p2 => p2.documents
                        .Exists(d2 => !personList1
                            .Exists(p1 => p1.documents
                                .Contains(d2)
                            )
                        )
                    ).ToList();

        result.ForEach(r => Console.WriteLine(r.name));
        // Should print person3's name

Classes

public class Person
{
    public string name { get; set; }
    public List<IdentificationDocument> documents { get; set; }

    public Person()
    {
        documents = new List<IdentificationDocument>();
    }
}

public class IdentificationDocument
{
    public string number { get; set; }
}

Full code

https://dotnetfiddle.net/gS57gV

Does anyone know how to improve the query's performance? Thank you!

Upvotes: 0

Views: 795

Answers (1)

nvoigt

Reputation: 77304

Put all relevant data in a structure made for lookup first:

var lookup = new HashSet<string>(personList1.SelectMany(p => p.documents).Select(d => d.number));

var result = personList2.Where(p => !p.documents.Select(d => d.number).Any(lookup.Contains));
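
For completeness, here is a minimal, self-contained sketch of the same idea. It assumes documents should be matched by their number (the query in the question compares IdentificationDocument instances, which is reference equality unless Equals is overridden). The sample data and the helper names WithAnyUnknownDocument and WithNoKnownDocument are hypothetical and only there for illustration. The two helpers differ on purpose: the first mirrors the question's query (at least one document missing from list 1), while the second keeps only people with no document in list 1 at all.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class IdentificationDocument
{
    public string number { get; set; }
}

public class Person
{
    public string name { get; set; }
    public List<IdentificationDocument> documents { get; set; } = new List<IdentificationDocument>();
}

// Hypothetical helper class, not part of the question's code.
public static class PersonComparer
{
    // Collect every document number from list1 once, so that each later
    // membership check is a hash lookup instead of a scan over list1.
    private static HashSet<string> KnownNumbers(IEnumerable<Person> list1)
    {
        return new HashSet<string>(list1.SelectMany(p => p.documents).Select(d => d.number));
    }

    // People in list2 with at least one document number absent from list1
    // (same condition as the nested Exists query in the question).
    public static List<Person> WithAnyUnknownDocument(List<Person> list1, List<Person> list2)
    {
        var known = KnownNumbers(list1);
        return list2.Where(p => p.documents.Any(d => !known.Contains(d.number))).ToList();
    }

    // People in list2 for whom none of their document numbers appear in list1.
    public static List<Person> WithNoKnownDocument(List<Person> list1, List<Person> list2)
    {
        var known = KnownNumbers(list1);
        return list2.Where(p => !p.documents.Any(d => known.Contains(d.number))).ToList();
    }
}

public static class Program
{
    // Assumed sample data: person3 shares "B2" with person2 and also owns "C3".
    private static Person Make(string name, params string[] numbers)
    {
        var person = new Person { name = name };
        person.documents.AddRange(numbers.Select(n => new IdentificationDocument { number = n }));
        return person;
    }

    public static void Main()
    {
        var person1 = Make("person1", "A1");
        var person2 = Make("person2", "B2");
        var person3 = Make("person3", "B2", "C3");

        var personList1 = new List<Person> { person1, person2 };
        var personList2 = new List<Person> { person2, person3 };

        var anyUnknown = PersonComparer.WithAnyUnknownDocument(personList1, personList2);
        Console.WriteLine(string.Join(", ", anyUnknown.Select(p => p.name))); // prints: person3

        var noKnown = PersonComparer.WithNoKnownDocument(personList1, personList2);
        Console.WriteLine(noKnown.Count); // prints: 0
    }
}
```

Building the set costs one pass over all documents in list 1, and each Contains call on a HashSet is a constant-time lookup on average, so the total work grows roughly with the number of documents in both lists rather than with their product. Pick whichever of the two conditions matches what "does not have documents in list 1" means for your data.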

Upvotes: 2
