zengr

Reputation: 38899

How to find duplicates in an ArrayList<Object>?

This is a pretty common question, but I could not find this part:

Say I have this array list:

List<MyDataClass> arrayList = new ArrayList<MyDataClass>();

class MyDataClass {
   String name;
   String age;
}

Now, I need to find duplicates on the basis of age in MyDataClass and remove them. How is it possible using something like HashSet as described here?

I guess we will need to override equals in MyDataClass?

  1. But, what if I do not have the luxury of doing that?
  2. And how does HashSet internally detect duplicates and refuse to add them? I saw its implementation here in OpenJDK but couldn't understand it.

Upvotes: 10

Views: 34324

Answers (4)

saeed eivazi

Reputation: 914

Suppose you have a class named Person with two properties: id and firstName. Write this method in that class:

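String uniqueAttributes() {
  // The separator keeps distinct pairs such as (1, "2x") and (12, "x") from producing the same key.
  return id + "|" + firstName;
}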

The getDuplicates() method should then look like this:

public static List<Person> getDuplicates(final List<Person> personList) {
  return getDuplicatesMap(personList).values().stream()
      .filter(duplicates -> duplicates.size() > 1)
      .flatMap(Collection::stream)
      .collect(Collectors.toList());
}

private static Map<String, List<Person>> getDuplicatesMap(List<Person> personList) {
  return personList.stream().collect(Collectors.groupingBy(Person::uniqueAttributes));
}
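The same grouping idea can be applied to the class from the question. The following is only a minimal sketch, assuming that age is never null and that "duplicate" means "same age"; GroupingDemo, the constructor, and duplicatesByAge are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingDemo {
    // Stand-in for the class from the question (the constructor is an assumption).
    static class MyDataClass {
        String name;
        String age;

        MyDataClass(String name, String age) {
            this.name = name;
            this.age = age;
        }
    }

    // Returns every element whose age occurs more than once in the list.
    // Assumes age is non-null, because groupingBy rejects null keys.
    static List<MyDataClass> duplicatesByAge(List<MyDataClass> people) {
        Map<String, List<MyDataClass>> byAge = people.stream()
                .collect(Collectors.groupingBy(p -> p.age));
        return byAge.values().stream()
                .filter(group -> group.size() > 1) // keep only ages seen at least twice
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }
}
```

No equals or hashCode override is needed on MyDataClass here, because the map is keyed on the String age rather than on the object itself.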

Upvotes: 1

Harsha

Reputation: 49

public Set<Object> findDuplicates(List<Object> list) {
    Set<Object> items = new HashSet<Object>();
    Set<Object> duplicates = new HashSet<Object>();
    for (Object item : list) {
        if (items.contains(item)) {
            duplicates.add(item);
        } else {
            items.add(item);
        }
    }
    return duplicates;
}

Upvotes: 0

aioobe

Reputation: 420951

I'd suggest that you override both equals and hashCode (HashSet relies on both!)

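To remove the duplicates you could simply create a new HashSet with the ArrayList as argument, then clear the ArrayList and put back the elements stored in the HashSet. Note that a HashSet does not preserve the list's original order; use a LinkedHashSet instead if the order matters.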

class MyDataClass {
    String name;
    String age;

    @Override
    public int hashCode() {
        return name.hashCode() ^ age.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof MyDataClass))
            return false;

        MyDataClass mdc = (MyDataClass) obj;
        return mdc.name.equals(name) && mdc.age.equals(age);
    }
}

And then do

List<MyDataClass> arrayList = new ArrayList<MyDataClass>();

Set<MyDataClass> uniqueElements = new HashSet<MyDataClass>(arrayList);
arrayList.clear();
arrayList.addAll(uniqueElements);
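As for how HashSet detects duplicates: it uses hashCode to find candidate entries and equals to compare against them, and add returns false when an equal element is already present. The sketch below only illustrates that behaviour; HashSetDemo, Plain, and ByAge are hypothetical classes, and ByAge assumes that equality should depend on age alone.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashSetDemo {
    // No equals/hashCode override: the set falls back to object identity.
    static class Plain {
        final String age;

        Plain(String age) {
            this.age = age;
        }
    }

    // Equality and hash code are both derived from age only.
    static class ByAge {
        final String age;

        ByAge(String age) {
            this.age = age;
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(age);
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof ByAge && Objects.equals(age, ((ByAge) obj).age);
        }
    }

    // Two Plain objects with the same age are both kept.
    static int distinctPlain() {
        Set<Plain> set = new HashSet<>();
        set.add(new Plain("30"));
        set.add(new Plain("30")); // a different object, so not considered a duplicate
        return set.size();
    }

    // The second ByAge with the same age is rejected.
    static boolean secondAddAccepted() {
        Set<ByAge> set = new HashSet<>();
        set.add(new ByAge("30"));
        return set.add(new ByAge("30")); // add reports whether the set changed
    }

    public static void main(String[] args) {
        System.out.println(distinctPlain());     // prints 2
        System.out.println(secondAddAccepted()); // prints false
    }
}
```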

But, what if I do not have the luxury of doing that?

Then I'd suggest you write some sort of decorator class that provides these methods.

class MyDataClassDecorator {

    MyDataClass mdc;

    public MyDataClassDecorator(MyDataClass mdc) {
        this.mdc = mdc;
    }

    @Override
    public int hashCode() {
        return mdc.name.hashCode() ^ mdc.age.hashCode();
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof MyDataClassDecorator))
            return false;

        MyDataClassDecorator mdcd = (MyDataClassDecorator) obj;
        return mdcd.mdc.name.equals(mdc.name) && mdcd.mdc.age.equals(mdc.age);
    }
}
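One way such a decorator could be used is to wrap each element, collect the wrappers in a set, and then unwrap them. The sketch below is an illustration rather than a definitive implementation: it assumes MyDataClass cannot be modified, compares on age only (as the question asks), and keeps the first element seen for each age. DecoratorDemo, AgeKey, and removeDuplicateAges are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class DecoratorDemo {
    // Stand-in for the class from the question (the constructor is an assumption).
    static class MyDataClass {
        String name;
        String age;

        MyDataClass(String name, String age) {
            this.name = name;
            this.age = age;
        }
    }

    // Wrapper that treats two objects as equal when their ages match.
    static class AgeKey {
        final MyDataClass mdc;

        AgeKey(MyDataClass mdc) {
            this.mdc = mdc;
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(mdc.age);
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof AgeKey && Objects.equals(mdc.age, ((AgeKey) obj).mdc.age);
        }
    }

    // Returns a new list that keeps the first element seen for each age.
    static List<MyDataClass> removeDuplicateAges(List<MyDataClass> input) {
        Set<AgeKey> unique = new LinkedHashSet<>(); // preserves insertion order
        for (MyDataClass item : input) {
            unique.add(new AgeKey(item)); // ignored if an equal key is already present
        }
        List<MyDataClass> result = new ArrayList<>();
        for (AgeKey key : unique) {
            result.add(key.mdc); // unwrap back to the original object
        }
        return result;
    }
}
```

LinkedHashSet is used here instead of HashSet only so that the surviving elements stay in their original order.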

Upvotes: 17

Daniel

Reputation: 1527

If you are not able to override MyDataClass's hashCode and equals methods, you could write a wrapper class that handles this.

Upvotes: 1
