Wong Sze Chung
Wong Sze Chung

Reputation: 13

Removing duplicates from arraylist

for (int i=0; i<name.size(); i++)
{
   for (int j = 1; j<name.size(); j++)
      if (name.get(i).equals(name.get(j)))
      {
         name.remove(i);
         name.remove(j);
         j=j-1;
      }
}

Initially, name is an ArrayList with 400 elements. I am trying to remove the duplicated elements. I don't know why my compiler keeps giving me

java.lang.IndexOutOfBoundsException: Index: 1, Size: 1

note that I am trying to remove the duplicated pair. There can only be two same element in the arraylist. 3 or more is not possible.

Upvotes: 0

Views: 1049

Answers (3)

Matt
Matt

Reputation: 3760

I wouldn't remove items from a list that you are iterating over. It is possible to adjust your indices but it makes for code that is hard to read.

Instead, you can use an Iterator and it will handle the indice adjustments for you.

Here's a quick example to illustrate the concept (I simplified your problem somewhat, I'm not checking for a duplicate in this case, just "bob"):

ArrayList<String> names = getNames(); // populate the list with some names
Iterator<String> iterator = names.iterator();
while(iterator.hasNext()) {
    String name = iterator.next();
    if(name.equals("bob")) {
        iterator.remove();
    }
}

However, to find duplicates, I would use a different method entirely. Rather than using a nested loop, I would use the Set collection. A set cannot contain duplicates, and if you try and add a duplicate to the set the add() method will return false.

If you iterate over your list, add each item to the set and check if the add() method returns false, you will know when you have a duplicate. You can either remove it from your list, or just keep the set at the end and use that for your collection of names with no duplicates.

There's a question here with an answer that illustrates this method. You will consume more space this way (you will have both a list and a set in memory) but you will save a lot of time by not iterating over the list every time you have to check for a duplicate. Depending on the size of your list this may or may not be ideal.

Identify duplicates in a List

EDIT: Actually, you could just take your list of names and bulk add them to the set and the duplicates would be stripped out:

Set<String> namesNoDuplicates = new HashSet<String>();
namesNoDuplicates.addAll(names);

Upvotes: 0

MadProgrammer
MadProgrammer

Reputation: 347334

When you remove an item from the list, the indices change, which is not only causing the IndexOutOfBounds, but could mean you're removing the wrong values

Now, there are going to be a multitude of ways you might achieve this, for example...

List<String> name = new ArrayList<>(Arrays.asList(new String[]{"a", "b", "a"}));
List<String> discard = new ArrayList<>(25);
for (int outter = 0; outter < name.size(); outter++) {
    String match = name.get(outter);
    discard.clear();
    for (int inner = outter + 1; inner < name.size(); inner++) {
        String to = name.get(inner);
        if (match.equals(to)) {
            discard.add(to);
        }
    }
    if (discard.size() > 0) {
        discard.add(match);
        name.removeAll(discard);
    }
}
System.out.println(name);

This prints...

[b]

This simply collects any matching elements within the inner loop and places them into another List, which is then passed to the removeAll method of the original List after the inner loop has completed

The inner loop starts at the current/outer index (plus 1), as we've already processed all the values before it, so we shouldn't need to continue looping over those items

In theory, you could simply keep adding elements to the discard List and do a single removeAll at the end instead

Updated...

So, I was wondering if there might be another way to approach the problem using Java 8 and it's Stream support...

List<String> name = new ArrayList<>(Arrays.asList(new String[]{"a", "b", "a"}));
Set<String> allItems = new HashSet<>();
List<String> duplicates = name.stream()
                .filter(n -> !allItems.add(n)) //Set.add() returns false if the item was already in the set.
                .collect(Collectors.toList());
name.removeAll(duplicates);

So, basically, what this does is it collects all the duplicates in the name List and places them into the duplicates List (using the allItems as a temporary holding point).

Then you can simply use it to call removeAll to remove all the duplicate items.

Now, this relies on the hashcode and equals implementations of the objects to work

Upvotes: 0

Paul Boddington
Paul Boddington

Reputation: 37655

I think this works. You had 2 small errors.

for (int i = 0; i < name.size(); i++)
{
    for (int j = i + 1; j < name.size(); j++)   // j needs to start at i + 1 not 1.
        if (name.get(i).equals(name.get(j)))
        {
            name.remove(j);                     // You need to remove at the higher index
            name.remove(i);                     // first, because items are shifted left.
            j = j - 1;
        }
}

Upvotes: 3

Related Questions