Reputation: 28577
Given the following code, with two alternative ways to iterate through it,
is there any performance difference between these two methods?
Map<String, Integer> map = new HashMap<String, Integer>();
// populate map

// alt. #1
for (String key : map.keySet())
{
    Integer value = map.get(key);
    // use key and value
}

// alt. #2
for (Map.Entry<String, Integer> entry : map.entrySet())
{
    String key = entry.getKey();
    Integer value = entry.getValue();
    // use key and value
}
I am inclined to think that alt. #2 is the more efficient means of iterating through the entire map (but I could be wrong).
Upvotes: 68
Views: 91261
Reputation: 6760
Your second option is definitely more efficient, since the first option performs a get() lookup for every one of the n keys, while the second reads each value directly from the entry.
But nothing beats trying it out when you can. So here goes (not a perfect benchmark, but good enough to verify the assumption, at least on my machine):
// RandomStringUtils and RandomUtils come from Apache Commons Lang
public static void main(String args[]) {
    Map<String, Integer> map = new HashMap<String, Integer>();

    // populate map
    int mapSize = 500000;
    int strLength = 5;
    for (int i = 0; i < mapSize; i++)
        map.put(RandomStringUtils.random(strLength), RandomUtils.nextInt());

    long start = System.currentTimeMillis();
    // alt. #1
    for (String key : map.keySet()) {
        Integer value = map.get(key);
        // use key and value
    }
    System.out.println("Alt #1 took " + (System.currentTimeMillis() - start) + " ms");

    start = System.currentTimeMillis();
    // alt. #2
    for (Map.Entry<String, Integer> entry : map.entrySet()) {
        String key = entry.getKey();
        Integer value = entry.getValue();
        // use key and value
    }
    System.out.println("Alt #2 took " + (System.currentTimeMillis() - start) + " ms");
}
RESULTS (Some interesting ones)
With int mapSize = 5000; int strLength = 5;
Alt #1 took 26 ms
Alt #2 took 20 ms
With int mapSize = 50000; int strLength = 5;
Alt #1 took 32 ms
Alt #2 took 20 ms
With int mapSize = 50000; int strLength = 50;
Alt #1 took 22 ms
Alt #2 took 21 ms
With int mapSize = 50000; int strLength = 500;
Alt #1 took 28 ms
Alt #2 took 23 ms
With int mapSize = 500000; int strLength = 5;
Alt #1 took 92 ms
Alt #2 took 57 ms
...and so on
Upvotes: 67
Reputation: 15487
The most efficient ways (according to my benchmark) are to use the new HashMap.forEach() method added in Java 8, or HashMap.entrySet().forEach().
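For illustration, a minimal sketch of those two calls (assuming a populated Map<String, Integer> named map, matching the question's setup):

// internal iteration: the map hands each key/value pair to the BiConsumer
map.forEach((key, value) -> {
    // use key and value
});

// same idea via the entry view
map.entrySet().forEach(entry -> {
    // use entry.getKey() and entry.getValue()
});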
JMH Benchmark:
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;

// @State is required for benchmark classes with instance fields;
// Scope.Thread keeps the mutable fields confined to a single thread
@State(Scope.Thread)
public class Test {

    @Param({"50", "500", "5000", "50000", "500000"})
    int limit;

    HashMap<String, Integer> m = new HashMap<>();

    public Test() {
    }

    @Setup(Level.Trial)
    public void setup() {
        // re-create the map, then fill it with `limit` entries
        m = new HashMap<>(m);
        for (int i = 0; i < limit; i++) {
            m.put(i + "", i);
        }
    }

    int i;

    @Benchmark
    public int forEach(Blackhole b) {
        i = 0;
        m.forEach((k, v) -> { i += k.length() + v; });
        return i;
    }

    @Benchmark
    public int keys(Blackhole b) {
        i = 0;
        for (String key : m.keySet()) { i += key.length() + m.get(key); }
        return i;
    }

    @Benchmark
    public int entries(Blackhole b) {
        i = 0;
        for (Map.Entry<String, Integer> entry : m.entrySet()) { i += entry.getKey().length() + entry.getValue(); }
        return i;
    }

    @Benchmark
    public int keysForEach(Blackhole b) {
        i = 0;
        m.keySet().forEach(key -> { i += key.length() + m.get(key); });
        return i;
    }

    @Benchmark
    public int entriesForEach(Blackhole b) {
        i = 0;
        m.entrySet().forEach(entry -> { i += entry.getKey().length() + entry.getValue(); });
        return i;
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(Test.class.getSimpleName())
                .forks(1)
                .warmupIterations(25)
                .measurementIterations(25)
                .measurementTime(TimeValue.milliseconds(1000))
                .warmupTime(TimeValue.milliseconds(1000))
                .timeUnit(TimeUnit.MICROSECONDS)
                .mode(Mode.AverageTime)
                .build();
        new Runner(opt).run();
    }
}
Results:
Benchmark            (limit)  Mode  Cnt      Score    Error  Units
Test.entries              50  avgt   25      0.282 ±  0.037  us/op
Test.entries             500  avgt   25      2.792 ±  0.080  us/op
Test.entries            5000  avgt   25     29.986 ±  0.256  us/op
Test.entries           50000  avgt   25   1070.218 ±  5.230  us/op
Test.entries          500000  avgt   25   8625.096 ± 24.621  us/op
Test.entriesForEach       50  avgt   25      0.261 ±  0.008  us/op
Test.entriesForEach      500  avgt   25      2.891 ±  0.007  us/op
Test.entriesForEach     5000  avgt   25     31.667 ±  1.404  us/op
Test.entriesForEach    50000  avgt   25    664.416 ±  6.149  us/op
Test.entriesForEach   500000  avgt   25   5337.642 ± 91.186  us/op
Test.forEach              50  avgt   25      0.286 ±  0.001  us/op
Test.forEach             500  avgt   25      2.847 ±  0.009  us/op
Test.forEach            5000  avgt   25     30.923 ±  0.140  us/op
Test.forEach           50000  avgt   25    670.322 ±  7.532  us/op
Test.forEach          500000  avgt   25   5450.093 ± 62.384  us/op
Test.keys                 50  avgt   25      0.453 ±  0.003  us/op
Test.keys                500  avgt   25      5.045 ±  0.060  us/op
Test.keys               5000  avgt   25     58.485 ±  3.687  us/op
Test.keys              50000  avgt   25   1504.207 ± 87.955  us/op
Test.keys             500000  avgt   25  10452.425 ± 28.641  us/op
Test.keysForEach          50  avgt   25      0.567 ±  0.025  us/op
Test.keysForEach         500  avgt   25      5.743 ±  0.054  us/op
Test.keysForEach        5000  avgt   25     61.234 ±  0.171  us/op
Test.keysForEach       50000  avgt   25   1142.416 ±  3.494  us/op
Test.keysForEach      500000  avgt   25   8622.734 ± 40.842  us/op
As you can see, HashMap.forEach and HashMap.entrySet().forEach() perform best for large maps, and on small maps the enhanced for loop over entrySet() matches them.
The keys methods are probably slower because they have to look up the value again for each key, while the other methods just read a field of an object they already have. The reason I would expect the iterator-based methods to be slower is that they do external iteration, which requires two method calls (hasNext and next) per element plus storing the iteration state in the iterator object, whereas the internal iteration done by forEach needs just one call to accept per element.
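To make that external vs. internal distinction concrete, here is a small self-contained sketch (the class name and sample data are mine, not part of the benchmark):

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class IterationStyles {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("a", 1);
        m.put("bb", 2);

        // external iteration: the caller drives the loop through the Iterator
        // protocol, paying a hasNext() and a next() call per element
        int sum1 = 0;
        Iterator<Map.Entry<String, Integer>> it = m.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            sum1 += entry.getKey().length() + entry.getValue();
        }

        // internal iteration: the map drives the loop and hands each pair to
        // the BiConsumer, one accept() call per element
        int[] sum2 = {0};
        m.forEach((key, value) -> sum2[0] += key.length() + value);

        System.out.println(sum1 + " " + sum2[0]);
    }
}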
You should profile on your target hardware, with your target data, and with your target action inside the loop, to get a more accurate result.
Upvotes: 9
Reputation: 13574
bguiz,
I think (I don't know) that iterating the EntrySet (alternative 2) is marginally more efficient, simply because it doesn't hash each key in order to get its value... Having said that, calculating the hash is an O(1) operation per entry, so we're ONLY talking about O(n) over the whole HashMap... but note that all this applies to HashMap only; other implementations of Map may have VERY different performance characteristics.
I do think you'd be "pushing it" to actually NOTICE the difference in performance. If you are concerned, why not set up a test case to time both iteration techniques?
If you don't have a REAL, reported performance issue, then you're really worrying about not very much... A few clock ticks here and there won't affect the overall usability of your program.
I believe that many, many other aspects of the code are typically more important than outright performance. Of course some blocks are "performance critical", and that is known BEFORE they are even written, let alone performance tested... but such cases are fairly rare. As a general approach it's better to focus on writing complete, correct, flexible, testable, reusable, readable, maintainable code... performance CAN be built in later, as the need arises.
Version 0 should be AS SIMPLE AS POSSIBLE, without any "optimizations".
Upvotes: 2
Reputation: 56616
Map:
Map<String, Integer> map = new HashMap<String, Integer>();
Besides those two options, there is one more; a combined runnable sketch follows the list.
1) keySet() - use it if you need only the keys

for ( String k : map.keySet() ) {
    ...
}

2) entrySet() - use it if you need both keys and values

for ( Map.Entry<String, Integer> entry : map.entrySet() ) {
    String k = entry.getKey();
    Integer v = entry.getValue();
    ...
}

3) values() - use it if you need only the values

for ( Integer v : map.values() ) {
    ...
}
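A combined, self-contained sketch of the three views (the class name and sample data are only illustrative):

import java.util.HashMap;
import java.util.Map;

public class MapViews {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("one", 1);
        map.put("two", 2);

        // 1) keys only
        for (String k : map.keySet()) {
            System.out.println(k);
        }

        // 2) keys and values
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }

        // 3) values only
        for (Integer v : map.values()) {
            System.out.println(v);
        }
    }
}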
Upvotes: 10
Reputation: 74750
In general, the second one would be a bit faster for a HashMap. It will only really matter if you have lots of hash collisions, since then the get(key) call gets slower than O(1) - it becomes O(k), where k is the number of entries in the same bucket (i.e. the number of keys with the same hash code, or with different hash codes that still map to the same bucket; this depends on the capacity, size and load factor of the map as well).
The entry-iterating variant does not have to do that lookup, so it is a bit faster here.
Another note: if the capacity of your map is a lot bigger than its actual size and you iterate over it a lot, you might consider using LinkedHashMap instead. It provides O(size) instead of O(size + capacity) complexity for a complete iteration (as well as a predictable iteration order). (You should still measure whether this really gives an improvement, since the factors might vary; LinkedHashMap has a bigger overhead for creating the map.)
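A minimal sketch of that trade-off (the class name and sizes are arbitrary, picked only to make capacity much larger than size):

import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class SparseMapIteration {
    public static void main(String[] args) {
        // HashMap iteration walks the whole bucket array, so a map with a
        // large capacity but few entries pays for the empty buckets too
        Map<String, Integer> sparse = new HashMap<>(1_000_000);
        sparse.put("a", 1);
        sparse.put("b", 2);

        // LinkedHashMap iterates its internal linked list of entries, so the
        // cost depends on the number of entries only, regardless of capacity
        Map<String, Integer> linked = new LinkedHashMap<>(1_000_000);
        linked.put("a", 1);
        linked.put("b", 2);

        for (Map.Entry<String, Integer> e : sparse.entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
        for (Map.Entry<String, Integer> e : linked.entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}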
Upvotes: 3
Reputation: 887215
The second snippet will be slightly faster, since it doesn't need to look each key up again.
All HashMap iterators call the nextEntry method, which returns an Entry<K,V>.
Your first snippet discards the value from the entry (in KeyIterator), then looks it up again in the map.
Your second snippet uses the key and value directly (from EntryIterator).
(Both keySet() and entrySet() are cheap calls.)
Upvotes: 11
Reputation: 5268
The latter is more efficient than the former. A tool like FindBugs will actually flag the former and suggest that you use the latter.
Upvotes: 5