Reputation: 133
I want to know how to count all the duplicate strings in a list in a C# WinForms application.
List<string> colorList = new List<string> { "red", "red", "yellow", "blue", "blue", "orange", "green", "red" };
For example, given the list above, the count would be 5, because "red" appears three times and "blue" appears twice.
I'm happy to use loops, LINQ, or anything else necessary.
In my actual program the list can be much larger, with thousands of entries, so performance is also a consideration.
Thanks!
Upvotes: 3
Views: 16795
Reputation: 2479
Not as fast as the accepted answer, but for reference, you can also use a dictionary to count the hits:
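var map = new Dictionary<string, int>();
foreach (var color in colorList)
{
    // Count how often each color occurs.
    if (map.ContainsKey(color)) map[color]++;
    else map.Add(color, 1);
}
// Sum the occurrences of every color seen more than once (5 for the sample list).
// Use map.Values.Count(x => x > 1) instead for the number of distinct duplicated colors (2).
return map.Values.Where(x => x > 1).Sum();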
It should still be faster than a LINQ GroupBy, since it stores only a counter per key instead of a whole group.
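Depending on how you read "count", the dictionary can give you two different numbers: how many distinct strings are duplicated, or how many entries belong to a duplicated string. Here is a minimal sketch that returns both. The class name DuplicateStats and the tuple field names are made up for illustration, and it uses TryGetValue instead of ContainsKey to avoid a second lookup.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class DuplicateStats
{
    // Returns the number of distinct values that occur more than once,
    // and the total number of entries that belong to those values.
    public static (int DistinctDuplicates, int TotalOccurrences) Count(IEnumerable<string> items)
    {
        var map = new Dictionary<string, int>();
        foreach (var item in items)
        {
            // seen is 0 when the key is not in the dictionary yet.
            map.TryGetValue(item, out var seen);
            map[item] = seen + 1;
        }

        // Keep only the counters of values that were seen more than once.
        var repeated = map.Values.Where(n => n > 1).ToList();
        return (repeated.Count, repeated.Sum());
    }
}
```

For the colorList from the question this gives DistinctDuplicates = 2 ("red" and "blue") and TotalOccurrences = 5 (3 + 2).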
Upvotes: 0
Reputation: 32119
If you just need the count of the duplicate items:
List<string> colorList = new List<string> { "red", "red", "yellow", "blue", "blue", "orange", "green", "red" };
var count = colorList.GroupBy(item => item)
    .Where(item => item.Count() > 1)
    .Sum(item => item.Count());
Try this for item-by-item details:
var result = colorList.GroupBy(item => item)
    .Select(item => new
    {
        Name = item.Key,
        Count = item.Count()
    })
    .OrderByDescending(item => item.Count)
    .ThenBy(item => item.Name)
    .ToList();
Upvotes: 7
Reputation: 2542
Another way to count the duplicate items in C# is with query syntax:
var duplicates = from d in colorList
                 group d by d into c
                 let count = c.Count()
                 orderby count descending
                 select new { Value = c.Key, Count = count };
foreach (var v in duplicates)
{
    string value = v.Value; // the string itself
    int count = v.Count;    // how often it occurs; anything above 1 is a duplicate
}
Upvotes: 1
Reputation: 12032
If you just need the total count:
var total = colorList.GroupBy(_ => _).Where(_ => _.Count() > 1).Sum(_ => _.Count());
An alternative which might be faster with large data sets:
var hashset = new HashSet<string>(); // to determine if we already have seen this color
var duplicates = new HashSet<string>(); // will contain the colors that are duplicates
var count = 0;
foreach (var color in colorList)
{
    if (!hashset.Add(color))
    {
        count++;
        if (duplicates.Add(color))
            count++;
    }
}
UPDATE: I measured both methods with a list of 2^25 (approx. 33.5 million) entries: the first one took 3.7 seconds, the second one 3.2 seconds.
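If you want to repeat the measurement yourself, something along these lines should do as a rough sketch. The class name, the Measure and BuildTestData helpers, the five-color palette and the fixed seed are all assumptions for illustration; this is not the data set the timings above were taken with, so expect different numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical benchmark helper, for illustration only.
public static class DuplicateCountBenchmark
{
    // The GroupBy variant from this answer.
    public static int CountWithGroupBy(IEnumerable<string> items) =>
        items.GroupBy(x => x).Where(g => g.Count() > 1).Sum(g => g.Count());

    // The two-HashSet variant from this answer.
    public static int CountWithHashSets(IEnumerable<string> items)
    {
        var seen = new HashSet<string>();
        var duplicates = new HashSet<string>();
        var count = 0;
        foreach (var item in items)
        {
            if (!seen.Add(item))
            {
                count++;                 // this repeated occurrence
                if (duplicates.Add(item))
                    count++;             // plus the first occurrence, counted once
            }
        }
        return count;
    }

    // Assumed test data: random picks from a small palette, fixed seed for repeatability.
    public static List<string> BuildTestData(int size, int seed)
    {
        var palette = new[] { "red", "yellow", "blue", "orange", "green" };
        var random = new Random(seed);
        return Enumerable.Range(0, size)
            .Select(_ => palette[random.Next(palette.Length)])
            .ToList();
    }

    // Times a single call; a serious benchmark would warm up and repeat.
    public static TimeSpan Measure(Func<int> action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

From your Main you could then call BuildTestData(1 << 25, 42) once and pass () => DuplicateCountBenchmark.CountWithGroupBy(data) and () => DuplicateCountBenchmark.CountWithHashSets(data) to Measure. A single Stopwatch run is noisy, so treat the result as a ballpark figure only.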
Upvotes: 15
Reputation: 813
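Well, I would do it without GroupBy:
List<string> colorList = new List<string> { "red", "red", "yellow", "blue", "blue", "orange", "green", "red" };
var count = 0;
// Use the same case-insensitive comparer for Distinct and for the comparison below;
// otherwise "Red" and "red" would both survive Distinct and be counted twice.
foreach (var item in colorList.Distinct(StringComparer.InvariantCultureIgnoreCase))
{
    var cnt = colorList.Count(i => i.Equals(item, StringComparison.InvariantCultureIgnoreCase));
    if (cnt > 1)
        count += cnt;
}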
Upvotes: 0