Matthew Keron

Reputation: 133

Count of duplicate items in a C# list

I want to know how to count all the duplicate strings in a list in a C# WinForms application.

List<string> colorList = new List<string> { "red", "red", "yellow", "blue", "blue", "orange", "green", "red" };

For example, given the list above, the count would be 5, because "red" appears 3 times and "blue" appears twice.

Happy to use loops or LINQ or anything necessary.

In my actual program this list can be much larger, with thousands of entries, so performance is also something to consider.

Thanks!

Upvotes: 3

Views: 16795

Answers (5)

Alvin Sartor

Reputation: 2479

Not as fast as the accepted answer, but for reference, one can also use a dictionary to count the hits:

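var map = new Dictionary<string, int>();
foreach (var color in colorList)
{
    if (map.ContainsKey(color)) map[color]++;
    else map.Add(color, 1);
}

// Number of distinct values that occur more than once (2 for the sample list).
// For the total number of occurrences (5), use map.Values.Where(x => x > 1).Sum().
return map.Values.Count(x => x > 1);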

It should be faster than a LINQ GroupBy, which has to store every element in its group rather than just a counter per key.
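The question asks for the total number of occurrences of duplicated values (5 for the sample list). A minimal sketch of the dictionary approach that returns that total could look like the following. `DuplicateCounter` and `CountDuplicates` are hypothetical names used only for illustration, and the sketch assumes the list contains no null entries, since `Dictionary<string, int>` rejects null keys.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of the original answers.
static class DuplicateCounter
{
    // Returns the total number of items whose value occurs more than once.
    // Assumes items contains no null entries (null keys would throw).
    public static int CountDuplicates(IEnumerable<string> items)
    {
        var occurrences = new Dictionary<string, int>();
        foreach (var item in items)
        {
            // seen is 0 when the key is not yet in the dictionary
            occurrences.TryGetValue(item, out var seen);
            occurrences[item] = seen + 1;
        }

        var total = 0;
        foreach (var n in occurrences.Values)
        {
            if (n > 1) total += n; // only values seen at least twice count
        }
        return total;
    }
}
```

Calling `DuplicateCounter.CountDuplicates(colorList)` on the list from the question should give 5 ("red" three times plus "blue" twice). `TryGetValue` is used here so that each item needs a single lookup before the write, instead of `ContainsKey` followed by the indexer.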

Upvotes: 0

TanvirArjel

Reputation: 32119

If you just need the count of the duplicate items:

 List<string> colorList = new List<string> { "red", "red", "yellow", "blue", "blue", "orange", "green", "red" };

 var count = colorList.GroupBy(item => item)
                      .Where(item => item.Count() > 1)
                      .Sum(item => item.Count());

For per-item details, try this:

var result = colorList.GroupBy(item => item)
                      .Select(item => new
                          {
                              Name = item.Key,
                              Count = item.Count()
                          })
                      .OrderByDescending(item => item.Count)
                      .ThenBy(item => item.Name)
                      .ToList();

Upvotes: 7

Afnan Ahmad

Reputation: 2542

Another way to count the duplicate items in C# is as follows:

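var duplicates = from d in colorList
                 group d by d into c
                 let count = c.Count()
                 where count > 1          // keep only values that occur more than once
                 orderby count descending
                 select new { Value = c.Key, Count = count };

foreach (var v in duplicates)
{
    string value = v.Value; // the duplicated string
    int count = v.Count;    // how many times it occurs
}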

Upvotes: 1

Klaus Gütter

Reputation: 12032

If you just need the total count:

var total = colorList.GroupBy(_ => _).Where(_ => _.Count() > 1).Sum(_ => _.Count());

An alternative which might be faster with large data sets:

var hashset = new HashSet<string>(); // to determine if we already have seen this color
var duplicates = new HashSet<string>(); // will contain the colors that are duplicates
var count = 0;
foreach (var color in colorList)
{
    if (!hashset.Add(color))
    {
        count++;
        if (duplicates.Add(color))
            count++;
    }
}

UPDATE: I measured both methods with a list of 2^25 (about 33.5 million) entries: the first took 3.7 seconds, the second 3.2 seconds.
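To repeat a comparison like this on your own data, a rough timing harness along these lines might be enough. This is only a sketch: `DuplicateBenchmark`, `CountWithGroupBy`, `CountWithHashSets`, and `Time` are hypothetical names, and a single `Stopwatch` run is a coarse measurement that depends on the data, the runtime, and JIT warm-up.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical harness wrapping the two approaches from this answer.
static class DuplicateBenchmark
{
    // LINQ version: group equal strings, keep groups with more than one item, sum their sizes.
    public static int CountWithGroupBy(List<string> items) =>
        items.GroupBy(x => x).Where(g => g.Count() > 1).Sum(g => g.Count());

    // HashSet version: a single pass, counting repeats as they are found.
    public static int CountWithHashSets(List<string> items)
    {
        var seen = new HashSet<string>();       // values encountered so far
        var duplicates = new HashSet<string>(); // values known to be duplicated
        var count = 0;
        foreach (var item in items)
        {
            if (!seen.Add(item))
            {
                count++;                        // this repeated occurrence
                if (duplicates.Add(item))
                    count++;                    // plus the first occurrence, counted once
            }
        }
        return count;
    }

    // Runs the given function once and returns the elapsed wall-clock time.
    public static TimeSpan Time(Func<int> action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

For example, `DuplicateBenchmark.Time(() => DuplicateBenchmark.CountWithHashSets(colorList))` returns the elapsed time for one run. Running each method a few times and discarding the first run helps reduce the effect of JIT compilation on the numbers.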

Upvotes: 15

Praneet Nadkar

Reputation: 813

I would do it without GroupBy:

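List<string> colorList = new List<string> { "red", "red", "yellow", "blue", "blue", "orange", "green", "red" };
var count = 0;
// Use the same case-insensitive comparison for Distinct and for the per-item count;
// otherwise "Red" and "red" would both survive Distinct and be counted twice.
// Note that this rescans the list for every distinct value, so it is slow for large lists.
foreach (var item in colorList.Distinct(StringComparer.InvariantCultureIgnoreCase))
{
    var cnt = colorList.Count(i => i.Equals(item, StringComparison.InvariantCultureIgnoreCase));
    if (cnt > 1)
        count += cnt;
}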

Upvotes: 0
