Mariusz Miesiak
Mariusz Miesiak

Reputation: 638

Remove duplicates from generic list<T>

I have came with solution to remove duplicates from generic list<T> in .NET 2.0 as follows:

List<CaseStudy> caseStudies = CaseStudyDAO.FindCaseStudiesByDate(DateTime.Now.Date, DateTime.Now.Date.AddDays(1));
caseStudies.RemoveAll(
        delegate(CaseStudy c)
        {
            return caseStudies.IndexOf(c) != caseStudies.FindIndex(
                delegate(CaseStudy f) { return c.Str == f.Str; });
        });

My questions are:

Is there more efficient way to do this? Only .NET 2.0 solution
What is the complexity of the above solution?

Thanks,
jan2k10

Upvotes: 2

Views: 1861

Answers (2)

Jon Skeet
Jon Skeet

Reputation: 1500385

Just to expand on Eric's comment about O(n) time if you're happy to use more space, I'd do something like this:

Dictionary<string, CaseStudy> lookup = new Dictionary<string, CaseStudy>();
foreach (CaseStudy cs in caseStudies)
{
    lookup[cs.Str] = cs;
}
caseStudies = new List<CaseStudy>(lookup.Values);

A couple of notes:

  • This changes the value of caseStudies to refer to a new list. If you wanted it to be within the same List<T>, you could use:

    caseStudies.Clear();
    caseStudies.AddRange(lookup.Values);
    
  • This keeps the last element in the list with each distinct Str value. That was just to make it as short as possible. If you want the first element, use:

    foreach (CaseStudy cs in caseStudies)
    {
        if (!lookup.ContainsKey(cs.Str))
        {
            lookup[cs.Str] = cs;
        }
    }
    

Upvotes: 5

Eric Lippert
Eric Lippert

Reputation: 660024

The time complexity of RemoveAll is O(n). The time complexity of the indexing is O(n), so that's a grand total of O(n^2) time complexity. The space complexity is, I think, O(1).

Is there a more efficient way to do it? Yes. You can do it in O(n) time complexity provided you're willing to spend more space on it.

Upvotes: 12

Related Questions