Reputation: 2976
I have a List
of elements that have no measurable order among them. Their attributes are also complex enough that I cannot simply insert them into a Set
(since different attributes may represent the same element).
Through the course of my program, I analyse each element in the list and, based on it, other elements are added (like building a graph and going to each node to add additional paths and nodes). However, they added elements may be equivalent to other elements in the List
. In this case they are not added, and the equivalent element has a property changed (let's assume, a counter).
I've been using this code to find whether an equivalent state exists:
public static State stateAlreadyExists(State current) {
for (State any : list) {
if (equivalencyMethod(any, current)) {
return any;
}
}
return null;
}
However, this code, albeit having O(n)
complexity, is not performant enough for my case. Every element I add will have this code executed, and I add about sqrt(N)
elements per analysed element (so, analysing the element 400 is creating 20 new elements, for example). To increase the performance, I used Java's parallel streams:
public static State stateAlreadyExists(State current) {
Optional<State> opt = list.parallelStream().filter(
any -> equivalencyMethod(any, current)).findFirst();
if (opt.isPresent()) {
return opt.get();
}
return null;
}
And the performance increased significantly. The problem is this code isn't really equivalent, since we analyse the entire stream before returning an element. Most times, the equivalent element is in the first sqrt(N)
elements of the list, so a method that stops on the first match would be better.
I'm aware there is a noneFound()
method for streams. It returns as soon as a match is found. However, it returns a boolean
, not the element itself. Is there a way to use this or a similarly built method to return the first match found?
According to the JavaDoc, findFirst():
returns the first element of this stream
While findAny():
selects any element in the stream. This is to allow for maximal performance in parallel operations.
So, my code can become even more performant by using the findAny()
call, since order doesn't really matter for my problem, due to there being only 1 equivalent element at any time.
Upvotes: 1
Views: 204
Reputation: 35457
findFirst
is a short-circuiting terminal operation (thanks Keppil).
public static void main(String[] args) {
final AtomicInteger countNew = new AtomicInteger();
final AtomicInteger countDoStuff = new AtomicInteger();
class A {
A() { countNew.getAndIncrement(); }
public boolean doStuff() { return countDoStuff.getAndIncrement() % 3 == 2; }
}
Stream.generate(A::new).limit(20).filter(A::doStuff).findFirst();
System.out.println("Number of times an A was created: " + countNew);
System.out.println("Number of times doStuff was called: " + countDoStuff);
}
this code will print
Number of times an A was created: 3
Number of times doStuff was called: 3
but not
Number of times an A was created: 20
Number of times doStuff was called: 20
or even less
Number of times an A was created: 20
Number of times doStuff was called: 3
Upvotes: 2
Reputation: 6092
I would suggest you an alternative approach. Try to formulate your equivalency criteria in a way such that you can create a hash function that substantially reduces the number of pairwise equivalency checks that you have to do. For the hash function it's no problem if two non-equal items have the same hash, the only important is that two equal items have the same hash. Then you can simply store your elements in a HashSet
which would do the heavy work for you, you would only have to implement equals(Object other)
and hashCode()
of the elements.
If finding a proper hash code is impossible, you could still think if you are able to sort your objects, that is you can formulate a comparison function that can tell the order of any two of your objects (or that they are equal). Then you could use a TreeSet
with your custom comparator and that would also work super fast.
Upvotes: 1