Reputation: 3110
I want to be able to remove multiple elements from a set while I am iterating over it. Initially I hoped that iterators were smart enough for the naive solution below to work.
Set<SomeClass> set = new HashSet<SomeClass>();
fillSet(set);
Iterator<SomeClass> it = set.iterator();
while (it.hasNext()) {
set.removeAll(setOfElementsToRemove(it.next()));
}
But this throws a ConcurrentModificationException
.
Note that iterator.remove() will not work as far as I can see because I need to remove multiple things at a time. Also assume that it is not possible to identify which elements to remove "on the fly", but it is possible to write the method setOfElementsToRemove()
. In my specific case it would take up a lot of memory and processing time to determine what to remove while iterating. Making copies is also not possible because of memory constraints.
setOfElementsToRemove()
will generate some set of SomeClass instances that I want to remove, and fillSet(set)
will fill the set with entries.
After searching Stack Overflow I could not find a good solution to this problem but a few hours break later I realized the following would do the job.
Set<SomeClass> set = new HashSet<SomeClass>();
Set<SomeClass> outputSet = new HashSet<SomeClass>();
fillSet(set);
while (!set.isEmpty()) {
Iterator<SomeClass> it = set.iterator();
SomeClass instance = it.next();
outputSet.add(instance);
set.removeAll(setOfElementsToRemoveIncludingThePassedValue(instance));
}
setOfElementsToRemoveIncludingThePassedValue()
will generate a set of elements to remove that includes the value passed to it. We need to remove the passed value so set
will empty.
My question is whether anyone has a better way of doing this or whether there are collection operations that support these kind of removals.
Also, I thought I would post my solution because there seems to be a need and I wanted to contribute the the excellent resource that is Stack Overflow.
Upvotes: 30
Views: 58119
Reputation: 106
Copied from the Java API:
The List interface provides a special iterator, called a ListIterator, that allows element insertion and replacement, and bidirectional access in addition to the normal operations that the Iterator interface provides. A method is provided to obtain a list iterator that starts at a specified position in the list.
I thought I would point out that the ListIterator which is a special kind of Iterator is built for replacement.
Upvotes: 0
Reputation: 463
You could try the java.util.concurrent.CopyOnWriteArraySet
which gives you an iterator that is a snapshot of the set at the time of the iterator creation. Any changes you make to the set (i.e. by calling removeAll()
) won't be visible in the iterator, but are visible if you look at the set itself (and removeAll()
won't throw).
Upvotes: 6
Reputation: 48659
It is possible to implement a Set
that allows its elements to be removed whilst iterating over it.
I think the standard implementations (HashSet, TreeSet etc.) disallow it because that means they can use more efficient algorithms, but it's not hard to do.
Here's an incomplete example using Google Collections:
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import com.google.common.base.Predicates;
import com.google.common.collect.ForwardingSet;
import com.google.common.collect.Iterators;
import com.google.common.collect.Sets;
public class ConcurrentlyModifiableSet<E>
extends ForwardingSet<E> {
/** Create a new, empty set */
public ConcurrentlyModifiableSet() {
Map<E, Boolean> map = new ConcurrentHashMap<E, Boolean>();
delegate = Sets.newSetFromMap(map);
}
@Override
public Iterator<E> iterator() {
return Iterators.filter(delegate.iterator(), Predicates.in(delegate));
}
@Override
protected Set<E> delegate() {
return this.delegate;
}
private Set<E> delegate;
}
Note: The iterator does not support the remove()
operation (but the example in the question does not require it.)
Upvotes: 0
Reputation: 8575
Normally when you remove an element from a collection while looping over the collection, you'll get a Concurrent Modification Exception. This is partially why the Iterator interface has a remove() method. Using an iterator is the only safe way to modify a collection of elements while traversing them.
The code would go something like this:
Set<SomeClass> set = new HashSet<SomeClass>();
fillSet(set);
Iterator<SomeClass> setIterator = set.iterator();
while (setIterator.hasNext()) {
SomeClass currentElement = setIterator.next();
if (setOfElementsToRemove(currentElement).size() > 0) {
setIterator.remove();
}
}
This way you'll safely remove all elements that generate a removal set from your setOfElementsToRemove().
EDIT
Based on a comment to another answer, this may be more what you want:
Set<SomeClass> set = new HashSet<SomeClass>();
Set<SomeClass> removalSet = new HashSet<SomeClass>();
fillSet(set);
for (SomeClass currentElement : set) {
removalSet.addAll(setOfElementsToRemove(currentElement);
}
set.removeAll(removalSet);
Upvotes: 41
Reputation: 67820
If you have enough memory for one copy of the set, I'll assume you also have enough memory for two copies. The Kafka-esque rules you cite don't seem to forbid that :)
My suggestion, then:
fillSet(set);
fillSet(copy);
for (Object item : copy) {
if (set.contains(item)) { // ignore if not
set.removeAll(setOfStuffToRemove())
}
}
so copy stays intact and just provides the stuff to loop on, while set suffers deletions. Stuff that was removed from set in the meantime will be ignored.
Upvotes: 2
Reputation: 2376
Instead of iterating through all the elements in the Set to remove the ones you want, you can actually use Google Collections (not something you can't do it on your own though) and apply a Predicate to mask the ones you don't need.
package com.stackoverflow.q1675037;
import java.util.HashSet;
import java.util.Set;
import org.junit.Assert;
import org.junit.Test;
import com.google.common.base.Predicate;
import com.google.common.collect.Iterables;
import com.google.common.collect.Sets;
public class SetTest
{
public void testFilter(final Set<String> original, final Set<String> toRemove, final Set<String> expected)
{
Iterable<String> mask = Iterables.filter(original, new Predicate<String>()
{
@Override
public boolean apply(String next) {
return !toRemove.contains(next);
}
});
HashSet<String> filtered = Sets.newHashSet(mask);
Assert.assertEquals(original.size() - toRemove.size(), filtered.size());
Assert.assertEquals(expected, filtered);
}
@Test
public void testFilterNone()
{
Set<String> original = new HashSet<String>(){
{
this.add("foo");
this.add("bar");
this.add("foobar");
}
};
Set<String> toRemove = new HashSet();
Set<String> expected = new HashSet<String>(){
{
this.add("foo");
this.add("bar");
this.add("foobar");
}
};
this.testFilter(original, toRemove, expected);
}
@Test
public void testFilterAll()
{
Set<String> original = new HashSet<String>(){
{
this.add("foo");
this.add("bar");
this.add("foobar");
}
};
Set<String> toRemove = new HashSet<String>(){
{
this.add("foo");
this.add("bar");
this.add("foobar");
}
};
HashSet<String> expected = new HashSet<String>();
this.testFilter(original, toRemove, expected);
}
@Test
public void testFilterOne()
{
Set<String> original = new HashSet<String>(){
{
this.add("foo");
this.add("bar");
this.add("foobar");
}
};
Set<String> toRemove = new HashSet<String>(){
{
this.add("foo");
}
};
Set<String> expected = new HashSet<String>(){
{
this.add("bar");
this.add("foobar");
}
};
this.testFilter(original, toRemove, expected);
}
@Test
public void testFilterSome()
{
Set<String> original = new HashSet<String>(){
{
this.add("foo");
this.add("bar");
this.add("foobar");
}
};
Set<String> toRemove = new HashSet<String>(){
{
this.add("bar");
this.add("foobar");
}
};
Set<String> expected = new HashSet<String>(){
{
this.add("foo");
}
};
this.testFilter(original, toRemove, expected);
}
}
Upvotes: 9
Reputation: 40851
Any solution that involves removing from the set you're iterating while you're iterating it, but not via the iterator, will absolutely not work. Except possibly one: you could use a Collections.newSetFromMap(new ConcurrentHashMap<SomeClass, Boolean>(sizing params))
. The catch is that now your iterator is only weakly consistent, meaning that each time you remove an element that you haven't encountered yet, it's undefined whether that element will show up later in your iteration or not. If that's not a problem, this might work for you.
Another thing you can do is build up a toRemove
set as you go instead, then set.removeAll(itemsToRemove);
only at the end. Or, copy the set before you start, so you can iterate one copy while removing from the other.
EDIT: oops, I see Peter Nix had already suggested the toRemove
idea (although with an unnecessarily hand-rolled removeAll
).
Upvotes: 6
Reputation: 45616
You should call Iterator.remove
method.
Also note, that on most java.util
collections the remove
method will generate exception if the contents of the collection have changed. So, if the code is multi-threaded use extra caution, or use concurrent collections.
Upvotes: 0
Reputation: 69412
Why don't you use the iterator's remove method on the objects you want to remove?
Iterators were introduced mainly because enumerators couldn't handle deleting while enumerating.
Upvotes: 1
Reputation: 103847
There's a simple answer to this - use the Iterator.remove() method.
Upvotes: 2