Reputation: 24593
I'm comparing 2 ways to filter lists, with and without using streams. It turns out that the method without using streams is faster for a list of 10,000 items. I'm interested in understanding why is it so. Can anyone explain the results please?
public static int countLongWordsWithoutUsingStreams(
final List<String> words, final int longWordMinLength) {
words.removeIf(word -> word.length() <= longWordMinLength);
return words.size();
}
public static int countLongWordsUsingStreams(final List<String> words, final int longWordMinLength) {
return (int) words.stream().filter(w -> w.length() > longWordMinLength).count();
}
Microbenchmark using JMH:
@Benchmark
@BenchmarkMode(Throughput)
@OutputTimeUnit(MILLISECONDS)
public void benchmarkCountLongWordsWithoutUsingStreams() {
countLongWordsWithoutUsingStreams(nCopies(10000, "IAmALongWord"), 3);
}
@Benchmark
@BenchmarkMode(Throughput)
@OutputTimeUnit(MILLISECONDS)
public void benchmarkCountLongWordsUsingStreams() {
countLongWordsUsingStreams(nCopies(10000, "IAmALongWord"), 3);
}
public static void main(String[] args) throws RunnerException {
final Options opts = new OptionsBuilder()
.include(PracticeQuestionsCh8Benchmark.class.getSimpleName())
.warmupIterations(5).measurementIterations(5).forks(1).build();
new Runner(opts).run();
}
java -jar target/benchmarks.jar -wi 5 -i 5 -f 1
Benchmark
Mode Cnt Score Error Units
PracticeQuestionsCh8Benchmark.benchmarkCountLongWordsUsingStreams thrpt 5 10.219 ± 0.408 ops/ms
PracticeQuestionsCh8Benchmark.benchmarkCountLongWordsWithoutUsingStreams thrpt 5 910.785 ± 21.215 ops/ms
Edit: (as someone deleted the update posted as an answer)
public class PracticeQuestionsCh8Benchmark {
private static final int NUM_WORDS = 10000;
private static final int LONG_WORD_MIN_LEN = 10;
private final List<String> words = makeUpWords();
public List<String> makeUpWords() {
List<String> words = new ArrayList<>();
final Random random = new Random();
for (int i = 0; i < NUM_WORDS; i++) {
if (random.nextBoolean()) {
/*
* Do this to avoid string interning. c.f.
* http://en.wikipedia.org/wiki/String_interning
*/
words.add(String.format("%" + LONG_WORD_MIN_LEN + "s", i));
} else {
words.add(String.valueOf(i));
}
}
return words;
}
@Benchmark
@BenchmarkMode(AverageTime)
@OutputTimeUnit(MILLISECONDS)
public int benchmarkCountLongWordsWithoutUsingStreams() {
return countLongWordsWithoutUsingStreams(words, LONG_WORD_MIN_LEN);
}
@Benchmark
@BenchmarkMode(AverageTime)
@OutputTimeUnit(MILLISECONDS)
public int benchmarkCountLongWordsUsingStreams() {
return countLongWordsUsingStreams(words, LONG_WORD_MIN_LEN);
}
}
public static int countLongWordsWithoutUsingStreams(
final List<String> words, final int longWordMinLength) {
final Predicate<String> p = s -> s.length() >= longWordMinLength;
int count = 0;
for (String aWord : words) {
if (p.test(aWord)) {
++count;
}
}
return count;
}
public static int countLongWordsUsingStreams(final List<String> words,
final int longWordMinLength) {
return (int) words.stream()
.filter(w -> w.length() >= longWordMinLength).count();
}
Upvotes: 5
Views: 3730
Reputation: 44032
You do not return any values from your benchmark methods, thus, JMH has no chance to escape the computed values and your benchmark suffers dead code elimination. You compute how long it takes to do nothing. See the JMH page for further guidance.
Saying this, streams can be slower in some cases: Java 8: performance of Streams vs Collections
Upvotes: 2
Reputation: 28153
Whenever your benchmark says that some operation over 10000 elements takes 1ns (edit: 1µs), you probably found a case of clever JVM figuring out that your code doesn't actually do anything.
Collections.nCopies
doesn't actually make a list of 10000 elements. It makes a sort of a fake list with 1 element and a count of how many times it's supposedly there. That list is also immutable, so your countLongWordsWithoutUsingStreams
would throw an exception if there was something for removeIf
to do.
Upvotes: 5