Reputation: 1428
I've got a List<string> that contains duplicates and I need to find the indexes of each.
What is the most elegant, efficient way to do this, other than looping through all the items? I'm on .NET 4.0, so LINQ is an option. I've done tons of searching and can't find anything.
Sample data:
var data = new List<string>{"fname", "lname", "home", "home", "company"};
I need to get the indexes of "home".
Upvotes: 11
Views: 12754
Reputation: 485
I needed to find and remove the duplicates from a list of strings. I first found the indexes of the duplicate items and then filtered the list in a functional way using LINQ, without mutating the original list:
public static IEnumerable<string> RemoveDuplicates(IEnumerable<string> items)
{
    // Indexes of every occurrence after the first one; a HashSet keeps the lookup below O(1).
    var duplicateIndexes = new HashSet<int>(items
        .Select((item, index) => new { item, index })
        .GroupBy(g => g.item)
        .Where(g => g.Count() > 1)
        .SelectMany(g => g.Skip(1), (g, item) => item.index));

    // Keep only the items whose index is not a repeat.
    return items.Where((item, index) => !duplicateIndexes.Contains(index));
}
Upvotes: 0
Reputation: 166476
How about something like this:
var data = new List<string>{"fname", "lname", "home", "home", "company"};

// Compute the duplicated values once instead of regrouping for every element.
var duplicateValues = data
    .GroupBy(i => i)
    .Where(g => g.Count() > 1)
    .Select(g => g.Key)
    .ToList();

var duplicates = data
    .Select((x, index) => new { Text = x, index })
    .Where(x => duplicateValues.Contains(x.Text));
Upvotes: 0
Reputation: 700562
You can create an object from each item containing its index, then group on the value and keep only the groups containing more than one object. Now you have a list of groupings whose objects contain the text and their original index:
var duplicates = data
    .Select((t, i) => new { Index = i, Text = t })
    .GroupBy(g => g.Text)
    .Where(g => g.Count() > 1);
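To pull the actual positions out of those groups, for instance the indexes of "home" in the question's sample data, something along these lines may work. This is only a sketch: the DuplicateFinder class and the FindDuplicateIndexes method are names made up for illustration, and collecting the result into a dictionary is my own assumption about what is convenient, not something the query above requires.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class DuplicateFinder
{
    // Maps each value that occurs more than once to the positions where it occurs.
    public static Dictionary<string, List<int>> FindDuplicateIndexes(IEnumerable<string> data)
    {
        return data
            .Select((t, i) => new { Index = i, Text = t })   // pair each item with its index
            .GroupBy(x => x.Text)                            // group equal values together
            .Where(g => g.Count() > 1)                       // keep only values seen more than once
            .ToDictionary(g => g.Key, g => g.Select(x => x.Index).ToList());
    }
}

public static class Program
{
    public static void Main()
    {
        var data = new List<string> { "fname", "lname", "home", "home", "company" };
        foreach (var pair in DuplicateFinder.FindDuplicateIndexes(data))
        {
            // For the sample data this prints: home: 2, 3
            Console.WriteLine("{0}: {1}", pair.Key, string.Join(", ", pair.Value));
        }
    }
}
```

Note that this reports every position of a duplicated value, including the first one. If only the repeats are wanted, skipping the first element of each group (g.Skip(1)) before selecting the index should do it.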
Upvotes: 24
Reputation: 8986
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        var data = new List<string> { "fname", "lname", "home", "home", "company" };
        foreach (var duplicate in FindDuplicates(data))
        {
            Console.WriteLine("Duplicate: {0} at index {1}", duplicate.Item1, duplicate.Item2);
        }
    }

    public static IEnumerable<Tuple<T, int>> FindDuplicates<T>(IEnumerable<T> data)
    {
        var hashSet = new HashSet<T>();
        int index = 0;
        foreach (var item in data)
        {
            if (hashSet.Contains(item))
            {
                yield return Tuple.Create(item, index);
            }
            else
            {
                hashSet.Add(item);
            }
            index++;
        }
    }
}
Upvotes: 3