Stefanvds

Reputation: 5916

Find a dupe in a List with Linq

I am building a list of users. Each user has a FullName, and I compare users on FullName.

I'm taking a DataTable with the users from the old DB, parsing each row into a 'User' object, and adding it to a List<User>, which in the code is a List<Deelnemer>.

It goes like this:

    List<Deelnemer> tempDeeln = new List<Deelnemer>();
    bool dupes = false;
    foreach (DataRow rij in deeln.Rows) {
            Deelnemer dln = new Deelnemer();
            dln.Dln_Creatiedatum = DateTime.Now;
            dln.Dln_Email = rij["Ler_Email"].ToString();
            dln.Dln_Inst_ID = inst.Inst_ID;
            dln.Dln_Naam = rij["Ler_Naam"].ToString();
            dln.Dln_Username = rij["LerLog_Username"].ToString();
            dln.Dln_Voornaam = rij["Ler_Voornaam"].ToString();
            dln.Dln_Update = (DateTime)rij["Ler_Update"];
            if (!dupes && tempDeeln.Count(q => q.FullName.ToLower() == dln.FullName.ToLower()) > 0)
                dupes = true;
            tempDeeln.Add(dln);
     }

Then, when the foreach is done, I check whether the bool is true, find which users are duplicates, and remove the oldest ones.

Now, I think this part of the code is very bad:

     if (!dupes && tempDeeln.Count(q => q.FullName.ToLower() == dln.FullName.ToLower()) > 0)

It runs for every user added, and each time it scans all the users created so far.

My question: how would I optimize this?

Upvotes: 1

Views: 225

Answers (3)

Ani

Reputation: 113402

You can use a set such as a HashSet<T> to track the unique names observed so far. A hash set supports insertion and lookup in constant time on average, so a full linear search is not required for every new item, unlike in your existing solution.

    var uniqueNames = new HashSet<string>(StringComparer.CurrentCultureIgnoreCase);
    ...

    foreach(...)
    {
       ...

       if(!dupes)
       {
           // Expression is true only if the set already contained the string.
           dupes = !uniqueNames.Add(dln.FullName);
       }
    }
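
For reference, here is a minimal self-contained sketch of the same idea, assuming the names are available as plain strings. `DupeCheck` and `HasDuplicateNames` are hypothetical names used only for illustration; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

public static class DupeCheck
{
    // Hypothetical helper: returns true as soon as a name repeats,
    // comparing names case-insensitively under the current culture.
    public static bool HasDuplicateNames(IEnumerable<string> fullNames)
    {
        var seen = new HashSet<string>(StringComparer.CurrentCultureIgnoreCase);
        foreach (string name in fullNames)
        {
            // HashSet<T>.Add returns false when an equal element is already present.
            if (!seen.Add(name))
                return true;
        }
        return false;
    }
}
```

Passing the comparer to the set also makes the `ToLower()` calls unnecessary, since the set itself treats names that differ only in case as equal.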

If you want to "remove" dupes (i.e. produce one representative element for each name) after you have assembled the list (without using a hash-set), you can do:

    var distinctItems = tempDeeln.GroupBy(dln => dln.FullName,
                                          StringComparer.CurrentCultureIgnoreCase)
                                 .Select(g => g.First());
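
Note that `g.First()` keeps whichever item was added first for each name. Since the question asks to remove the oldest duplicates, a possible variation is to sort each group by its update timestamp and keep the newest. The sketch below assumes a simplified, hypothetical `Participant` class standing in for `Deelnemer`, with `Updated` playing the role of `Dln_Update`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Deelnemer class.
public class Participant
{
    public string FullName { get; set; } = "";
    public DateTime Updated { get; set; }
}

public static class Dedupe
{
    // Groups by name (case-insensitive) and keeps the most recently
    // updated item from each group.
    public static List<Participant> KeepNewestPerName(IEnumerable<Participant> items)
    {
        return items
            .GroupBy(p => p.FullName, StringComparer.CurrentCultureIgnoreCase)
            .Select(g => g.OrderByDescending(p => p.Updated).First())
            .ToList();
    }
}
```

Groups come out in the order in which each name first appears in the input, so the relative order of distinct names is preserved.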

Upvotes: 3

Vlad Bezden

Reputation: 89507

Count goes through the whole list of items even after it has found a match. Try Any instead: it stops at the first occurrence of the item.

    if (!dupes && tempDeeln.Any(q => q.FullName.ToLower() == dln.FullName.ToLower()))
        dupes = true;
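
To make the difference visible, the sketch below counts how many elements each method inspects before producing its answer. The class and method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyVsCount
{
    // Number of elements the predicate sees when Count(...) > 0 is used.
    public static int ElementsInspectedByCount(List<string> names, string target)
    {
        int calls = 0;
        bool found = names.Count(n =>
        {
            calls++;
            return string.Equals(n, target, StringComparison.CurrentCultureIgnoreCase);
        }) > 0;
        return calls;
    }

    // Number of elements the predicate sees when Any(...) is used.
    public static int ElementsInspectedByAny(List<string> names, string target)
    {
        int calls = 0;
        bool found = names.Any(n =>
        {
            calls++;
            return string.Equals(n, target, StringComparison.CurrentCultureIgnoreCase);
        });
        return calls;
    }
}
```

For the list "An", "Jan", "Piet", "Jan" and the target "jan", Count inspects all four elements while Any stops after two. When there is no match, Any still has to scan the whole list, so the check remains linear per added user in the worst case.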

Upvotes: 0
