cyberz
cyberz

Reputation: 923

Map lookup performance

I'd like to do something using a map value for a given key only if the map contains the given key. Naively I would write:

Map<String, String> myMap = ...;

if(myMap.containsKey(key)) {
  String value = myMap.get(key);

  // Do things with value
}

The code above looks easy to understand, but from a performance point of view, wouldn't it be better the following code?

Map<String, String> myMap = ...;

String value = myMap.get(key);

if(value != null) {
  // Do things with value
}

In the second snippet I don't like the fact that value is declared with a wider scope.

How does the performance of given cases change with respect to the Map implementation?

Note: Let's assume that null values are not admitted in the map. I'm not talking about asymptotic complexity here, which is the same for both snippets

Upvotes: 11

Views: 21741

Answers (3)

James Dunn
James Dunn

Reputation: 8264

To answer your question,
"How does the performance of given cases change with respect to the Map implementation?"
The performance difference is negligible.

To comment on your comment,
"In the second snippet I don't like the fact that value is declared with a wider scope."
Good, you shouldn't. You see, there are two ways to get null returned from a Map:

  1. The key doesn't exist OR
  2. The key does exist, but its value is null (if the Map implementation allows null values, like HashMap).

So the two scenarios could actually have different results if the key existed with a null value!

EDIT

I wrote the following code to test out the performance of the two scenarios:

public class TestMapPerformance {

    static Map<String, String> myMap = new HashMap<String, String>();
    static int iterations = 7000000;

    // populate a map with seven million strings for keys
    static {
        for (int i = 0; i <= iterations; i++) {
            String tryIt = Integer.toString(i);
            myMap.put(tryIt, "hi");
        }
    }
    // run each scenario twice and print out the results.
    public static void main(String[] args) {
        System.out.println("Key Exists: " + testMap_CheckIfKeyExists(iterations));
        System.out.println("Value Null: " + testMap_CheckIfValueIsNull(iterations));
        System.out.println("Key Exists: " + testMap_CheckIfKeyExists(iterations));
        System.out.println("Value Null: " + testMap_CheckIfValueIsNull(iterations));
    }

    // Check if the key exists, then get its value  
    public static long testMap_CheckIfKeyExists(int iterations) {       
        Date date = new Date();
        for (int i = 0; i <= iterations; i++) {
            String key = Integer.toString(i);
            if(myMap.containsKey(key)) {
                String value = myMap.get(key);
                String newString = new String(value);
            }
        }
        return new Date().getTime() - date.getTime();
    }

    // Get the key's value, then check if that value is null
    public static long testMap_CheckIfValueIsNull(int iterations) {
        Date date = new Date();
        for (int i = 0; i <= iterations; i++) {
            String key = Integer.toString(i);
            String value = myMap.get(key);
            if(value != null) {
                String newString = new String(value);
            }
        }
        return new Date().getTime() - date.getTime();
    }

}

I ran it and this was the result:

Key Exists: 9901
Value Null: 11472
Key Exists: 11578
Value Null: 9387

So in conclusion, the difference in performance in negligible.

Upvotes: 4

Bernhard Barker
Bernhard Barker

Reputation: 55589

Map is an interface, so the implementing classes have quite a bit of freedom in how they implement each operation (it's entirely possible to write a class that buffers the last entry, which may allow constant time access for the get operation if it's the same as the last gotten object, making the two practically equivalent, except for a presumably required comparison).

For TreeMap and HashMap, for example, containsKey is essentially just a get operation (more specifically getEntry) with a check for null.

Thus, for these two containers, the first version should take roughly twice as long as the second (assuming you use the same type of Map in both cases).

Note that HashMap.get is O(1) (with a hash function well-suited to the data) and TreeMap.get is O(log n). So if you do any significant amount of work in the loop, and the Map doesn't contain in the order of millions of elements, the difference in performance is likely to be negligible.

However, note the disclaimer in the docs for Map.get:

If this map permits null values, then a return value of null does not necessarily indicate that the map contains no mapping for the key; it's also possible that the map explicitly maps the key to null. The containsKey operation may be used to distinguish these two cases.

Upvotes: 10

GerritCap
GerritCap

Reputation: 1616

Obviously the 2nd version is more performant: you only lookup the key in the map once while in the first version you look it up twice hence calculating twice the hashcode of the key and looking in the hashbuckets, assuming that you are using a hashmap of course.

You can have a completely different implementation of the Map interface that would be able to handle this kind of code much better by remembering the map entry that was linked to the key in the last contains method call, if the the subsequent get uses the same key (using the == operator) you can then immedialtely return the associated value from the remembered map entry.

However there is a danger in the 2nd method: what if I put this in the map:

map.put("null", null);

then map.get("null") would return null and you would treat it as "null" is not mapped while map.contains("null") would return true !

Upvotes: 1

Related Questions