Elsie

Reputation: 21

Combine duplicated items in a list - c#

Similar to remove duplicate items from list in c#

I want to create a list, and if a list item appears more than once, only treat it as one item, not duplicating it in the list and not ignoring it either.

Using the example from the ticket above: https://dotnetfiddle.net/NPqzne

List<MyClass> list = new List<MyClass>();

list.Add(new MyClass() { BillId = "123", classObj = {} });
list.Add(new MyClass() { BillId = "777", classObj = {} });
list.Add(new MyClass() { BillId = "999", classObj = {} });
list.Add(new MyClass() { BillId = "123", classObj = {} });

var result = myClassObject.GroupBy(x => x.BillId)
    .Where(x => x.Count() == 1)
    .Select(x => x.First());

Console.WriteLine(string.Join(", ", result.Select(x => x.BillId)));

How would I change that so results are

123, 777, 999 

rather than ignoring 123 altogether because it's a duplicate?

Upvotes: 2

Views: 245

Answers (5)

Gert Arnold

Reputation: 109080

There is an easy and standard way of preventing duplicates from being added to the list: use a HashSet and a custom IEqualityComparer.

The equality comparer should treat MyClass objects with the same BillId as equal. Using that specification, this comparer is generated (by ReSharper):

sealed class BillEqualityComparer : IEqualityComparer<MyClass>
{
    public bool Equals(MyClass x, MyClass y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null)) return false;
        if (ReferenceEquals(y, null)) return false;
        if (x.GetType() != y.GetType()) return false;
        return x.BillId == y.BillId;
    }

    public int GetHashCode(MyClass obj)
    {
        return obj.BillId.GetHashCode();
    }
}

Now the code only needs a slight modification:

HashSet<MyClass> hashSet = new HashSet<MyClass>(new BillEqualityComparer());

hashSet.Add(new MyClass() { BillId = "123", classObj = { } });
hashSet.Add(new MyClass() { BillId = "777", classObj = { } });
hashSet.Add(new MyClass() { BillId = "999", classObj = { } });
hashSet.Add(new MyClass() { BillId = "123", classObj = { } });

And you'll see that the last object isn't added (the Add method returns false).

I don't really see what "not duplicating it in the list and not ignoring it either" means in your view. You could check the return value of hashSet.Add and, when it is false, do something with the ignored item.
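
Checking the result of Add could look roughly like the sketch below. It rests on assumptions: MyClass is reduced to a string BillId property (the question doesn't show the class), the comparer is a condensed version of the one above, and DuplicateDemo.SplitDuplicates is a hypothetical helper name used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of MyClass; the question does not show its definition.
public class MyClass
{
    public string BillId { get; set; }
}

// Condensed version of the comparer above: equality is decided by BillId only.
public sealed class BillEqualityComparer : IEqualityComparer<MyClass>
{
    public bool Equals(MyClass x, MyClass y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.BillId == y.BillId;
    }

    // Falls back to 0 when BillId is null so hashing never throws.
    public int GetHashCode(MyClass obj) => obj.BillId?.GetHashCode() ?? 0;
}

public static class DuplicateDemo
{
    // Hypothetical helper: returns the unique items and collects
    // every rejected duplicate in the supplied list.
    public static HashSet<MyClass> SplitDuplicates(
        IEnumerable<MyClass> items, List<MyClass> rejected)
    {
        var unique = new HashSet<MyClass>(new BillEqualityComparer());
        foreach (var item in items)
        {
            // Add returns false when an equal item is already in the set.
            if (!unique.Add(item))
                rejected.Add(item);
        }
        return unique;
    }
}
```

With the four objects from the question, the returned set would hold three items and the rejected list would hold the second "123" object, which you could then log, merge into the first one, or handle in whatever way fits.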

Upvotes: 0

O. Jones

Reputation: 108641

To deduplicate a collection of instances of an arbitrary class, you first need to define what it means for two instances to be equal: that is, to duplicate one another. That's easy for simple data types like integers. It's a little harder for strings, because case sensitivity comes into play.

For your arbitrary class, you make it implement the IEquatable<T> interface. Once you have done that, you can make a HashSet of your instances. The process of inserting instances into that HashSet will remove the dupes.

Add this to your class definition to declare that it implements IEquatable.

public class MyClass : IEquatable<MyClass> {

Then implement Equals(MyClass) and override Equals(object) and GetHashCode() in your class; a HashSet relies on GetHashCode as well as Equals. An example is here. Visual Studio has helpful features to assist you in implementing all the methods you need.

If you need to be able to sort your instances, you can implement IComparable as well; then sort operations will work.

It's hard to give more specific advice about implementing those interfaces because you didn't describe MyClass in your question.
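
As a rough illustration only, here is a minimal sketch that assumes MyClass has a string BillId property and that BillId alone decides equality. Both are assumptions, since the real class isn't shown; any other members are left out.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape: only BillId is known from the question.
public class MyClass : IEquatable<MyClass>
{
    public string BillId { get; set; }

    // Strongly typed equality required by IEquatable<MyClass>.
    public bool Equals(MyClass other)
    {
        return other != null && BillId == other.BillId;
    }

    // Keep the untyped Equals consistent with the typed one.
    public override bool Equals(object obj) => Equals(obj as MyClass);

    // Equal objects must produce equal hash codes, so hash BillId only.
    public override int GetHashCode() => BillId?.GetHashCode() ?? 0;
}
```

With this in place, new HashSet<MyClass>(list) should keep one object per BillId without a separate comparer. Avoid changing BillId while an object sits in the set, because its hash code would change with it.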

Upvotes: 0

Abbas

Reputation: 1

You could use a Dictionary keyed by BillId (or a HashSet with a suitable equality comparer) instead of a List, since dictionary keys and set items are unique:

Dictionary<string, MyClass> dict = new Dictionary<string, MyClass>();
// TryAdd (available since .NET Core 2.0) returns false when the key already exists;
// plain Add would throw an ArgumentException on the duplicate key.
dict.TryAdd("123", new MyClass() { BillId = "123", classObj = {} });
dict.TryAdd("777", new MyClass() { BillId = "777", classObj = {} });
dict.TryAdd("999", new MyClass() { BillId = "999", classObj = {} });
dict.TryAdd("123", new MyClass() { BillId = "123", classObj = {} }); // not added: the key is already present

var result = dict.Select(x => x.Value);
// Typically prints 123, 777, 999 (Dictionary does not guarantee enumeration order)
Console.WriteLine(string.Join(", ", result.Select(x => x.BillId)));

Upvotes: 0

Dmitrii Bychenko

Reputation: 186668

Starting from .NET 6 you can try DistinctBy:

var result = myClassObject
  .DistinctBy(x => x.BillId)
  .ToList();

On older versions you can modify your current GroupBy solution. Drop the .Where(x => x.Count() == 1) filter, because you don't want to ignore duplicates, which have Count() > 1:

var result = myClassObject
  .GroupBy(x => x.BillId)
  .Select(x => x.First())
  .ToList();

Finally, a non-LINQ solution with the help of a HashSet<string>:

var result = new List<MyClass>();

var unique = new HashSet<string>();

foreach (var item in myClassObject)
  if (unique.Add(item.BillId))
    result.Add(item);

Upvotes: 0

Vivek Nuna

Reputation: 1

You can change these lines in your code. I tried it with your dotnetfiddle code, and it works as expected.

var result = list.Select(x => x.BillId).Distinct();
Console.WriteLine(string.Join(", ", result));

You need to use Distinct to get the unique values.

Thank you for providing the dotnetfiddle link; it made writing the code easy.

Upvotes: 1
