Shahar Shokrani

Reputation: 8762

Modify duplicate values with duplication index suffix (using Linq)

I have a list:

List<string> myList = new List<string>{ "dog", "cat", "dog", "bird" };

I want the output to be the list:

"dog (1)", "cat", "dog (2)", "bird"

I've already looked through this question, but it only covers counting the duplicates. My output should include each value's duplicate index, in the form value (index).

I've tried this code:

var q = list.GroupBy(x => x)
            .Where(y => y.Count()>1)
            .Select(g => new {Value = g.Key + "(" + g.Index + ")"})

but it does not work, for two reasons:

  1. I need to get the whole list back, or just modify the existing one (my attempt returns only the duplicated values).
  2. Each duplicate value needs a suffix based on its "duplicate index".

How can I do this in C#? Is there a way using LINQ?

Upvotes: 1

Views: 1427

Answers (4)

Rufus L

Reputation: 37050

Ok, @EricLippert challenged me and I couldn't let it go. Here's my second attempt, which I believe is much better performing and modifies the original list as requested. Basically we create a second list that contains all the duplicate entries in the first. Then we walk backwards through the first list, modifying any entries that have a counterpart in the duplicates list, and remove the item from the duplicates list each time we encounter one:

private static void Main()
{
    var myList = new List<string> {"dog", "cat", "dog", "bird"};
    var duplicates = myList.Where(item => myList.Count(i => i == item) > 1).ToList();

    for (var i = myList.Count - 1; i >= 0; i--)
    {
        var numDupes = duplicates.Count(item => item == myList[i]);
        if (numDupes <= 0) continue;
        duplicates.Remove(myList[i]);
        myList[i] += $" ({numDupes})";
    }

    Console.WriteLine(string.Join(", ", myList));

    Console.Write("\nDone!\nPress any key to exit...");
    Console.ReadKey();
}

Output

dog (1), cat, dog (2), bird

Upvotes: 1

Rotem

Reputation: 21947

This solution is not quadratic in the size of the list, and it modifies the list in place, as the OP preferred.

Any efficient solution will involve a pre-pass in order to find and count the duplicates.

List<string> myList = new List<string>{ "dog", "cat", "dog", "bird" };

//map out a count of all the duplicate words in dictionary.
var counts = myList
    .GroupBy(s => s)
    .Where(p => p.Count() > 1)
    .ToDictionary(p => p.Key, p => p.Count());

//modify the list, going backwards so we can take advantage of our counts.
for (int i = myList.Count - 1; i >= 0; i--)
{
    string s = myList[i];
    if (counts.ContainsKey(s))
    {
        //add the suffix and decrement the number of duplicates left to tag.
        myList[i] += $" ({counts[s]--})";
    }
}

Upvotes: 1

Eric Lippert

Reputation: 660297

The accepted solution works but is extremely inefficient when the size of the list grows large.

What you want to do is first get the information you need in an efficient data structure. Can you implement a class:

sealed class Counter<T>
{
  public void Add(T item) { }
  public int Count(T item) { }
}

where Count returns the number of times (possibly zero) that Add has been called with that item. (Hint: you could use a Dictionary<T, int> to good effect.)
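If you want to check your work, here is one possible sketch of that helper, following the Dictionary<T, int> hint. The field name counts is my own choice, the out-variable syntax assumes C# 7 or later, and the sketch assumes items are never null (a Dictionary rejects null keys):

```csharp
using System.Collections.Generic;

// One possible implementation of the Counter<T> helper (a sketch, not the only way).
sealed class Counter<T>
{
    // Maps each item to the number of times Add has been called with it.
    // Assumption: items are non-null, because Dictionary throws on a null key.
    private readonly Dictionary<T, int> counts = new Dictionary<T, int>();

    public void Add(T item)
    {
        // TryGetValue leaves current at 0 when the item has not been seen yet.
        counts.TryGetValue(item, out int current);
        counts[item] = current + 1;
    }

    public int Count(T item)
    {
        // Items that were never added report a count of zero.
        return counts.TryGetValue(item, out int count) ? count : 0;
    }
}
```

Both operations are a single dictionary lookup, so counting the whole list takes time proportional to its length instead of rescanning the list for every item.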

All right. Now that we have our useful helper we can:

var c1 = new Counter<string>();
foreach(string item in myList)
  c1.Add(item);

Great. Now we can construct our new list by making use of a second counter:

var result = new List<String>();
var c2 = new Counter<String>();
foreach(string item in myList)
{
  c2.Add(item);
  if (c1.Count(item) == 1)
    result.Add(item);
  else
    result.Add($"{item} ({c2.Count(item)})");
}

And we're done. Or, if you want to modify the list in place:

var c2 = new Counter<String>();
// It's a bad practice to mutate a list in a foreach, so
// we'll be sticklers and use a for.
for (int i = 0; i < myList.Count; i = i + 1)
{
  var item = myList[i];
  c2.Add(item);
  if (c1.Count(item) != 1)
    myList[i] = $"{item} ({c2.Count(item)})";
}

The lesson here is: create a useful helper class that solves one problem extremely well, and then use that helper class to make the solution to your actual problem more elegant. You need to count things to solve a problem? Make a thing-counter class.

Upvotes: 3

Rufus L

Reputation: 37050

One way to do this is to simply create a new list that contains the additional text for each item that appears more than once. When we find such an item, we build its formatted string using a counter variable, incrementing the counter for as long as the list of formatted strings already contains that string.

Note that this is NOT a well-performing solution. It was just the first thing that came to mind, but it's a place to start...

private static void Main()
{
    var myList = new List<string> { "dog", "cat", "dog", "bird" };

    var formattedItems = new List<string>();

    foreach (var item in myList)
    {
        if (myList.Count(i => i == item) > 1)
        {
            int counter = 1;
            while (formattedItems.Contains($"{item} ({counter})")) counter++;
            formattedItems.Add($"{item} ({counter})");
        }
        else
        {
            formattedItems.Add(item);
        }
    }

    Console.WriteLine(string.Join(", ", formattedItems));

    Console.Write("\nDone!\nPress any key to exit...");
    Console.ReadKey();
}

Output

dog (1), cat, dog (2), bird

Upvotes: 1
