Reputation: 2577

Case insensitive string as HashMap key

I would like to use case insensitive string as a HashMap key for the following reasons.

During initialization, my program creates HashMap with user defined String
While processing an event (network traffic in my case), I might received String in a different case but I should be able to locate the <key, value> from HashMap ignoring the case I received from traffic.

I've followed this approach

CaseInsensitiveString.java

    public final class CaseInsensitiveString {
            private String s;

            public CaseInsensitiveString(String s) {
                            if (s == null)
                            throw new NullPointerException();
                            this.s = s;
            }

            public boolean equals(Object o) {
                            return o instanceof CaseInsensitiveString &&
                            ((CaseInsensitiveString)o).s.equalsIgnoreCase(s);
            }

            private volatile int hashCode = 0;

            public int hashCode() {
                            if (hashCode == 0)
                            hashCode = s.toUpperCase().hashCode();

                            return hashCode;
            }

            public String toString() {
                            return s;
            }
    }

LookupCode.java

    node = nodeMap.get(new CaseInsensitiveString(stringFromEvent.toString()));

Because of this, I'm creating a new object of CaseInsensitiveString for every event. So, it might hit performance.

Is there any other way to solve this issue?

Upvotes: 214

Answers (14)

ryvantage

Reputation: 13489

I find solutions which require you to change the key (e.g., toLowerCase) very unwelcome and solutions which require TreeMap also unwelcome.

Since TreeMap changes the time complexity (compared to other HashMaps), I think it's more viable to simply go with a utility method that is O(n):

public static <T> T getIgnoreCase(Map<String, T> map, String key) {
    for(Entry<String, T> entry : map.entrySet()) {
        if(entry.getKey().equalsIgnoreCase(key))
            return entry.getValue();
    }
    return null;
}

This is that method. Since the sacrifice to performance (time complexity) looks inevitable, at least this doesn't require you to change the underlying map to suit the lookup.

Upvotes: 2

Lawrence Kesteloot

Reputation: 4348

You can use CollationKey objects instead of strings:

Locale locale = ...;
Collator collator = Collator.getInstance(locale);
collator.setStrength(Collator.SECONDARY); // Case-insensitive.
collator.setDecomposition(Collator.FULL_DECOMPOSITION);

CollationKey collationKey = collator.getCollationKey(stringKey);
hashMap.put(collationKey, value);
hashMap.get(collationKey);

Use Collator.PRIMARY to ignore accent differences.

The CollationKey API does not guarantee that hashCode() and equals() are implemented, but in practice you'll be using RuleBasedCollationKey, which does implement these. If you're paranoid, you can use a TreeMap instead, which is guaranteed to work at the cost of O(log n) time instead of O(1).

Upvotes: 2

nikita91000

Reputation: 261

Instead of creating your own class to validate and store case insensitive string as a HashMap key, you can use:

LinkedCaseInsensitiveMap wraps a LinkedHashMap, which is a Map based on a hash table and a linked list. Unlike LinkedHashMap, it doesn't allow null key inserting. LinkedCaseInsensitiveMap preserves the original order as well as the original casing of keys while allowing calling functions like get and remove with any case.

Eg:

Map<String, Integer> linkedHashMap = new LinkedCaseInsensitiveMap<>();
linkedHashMap.put("abc", 1);
linkedHashMap.put("AbC", 2);

System.out.println(linkedHashMap);

Output: {AbC=2}

Mvn Dependency:

Spring Core is a Spring Framework module that also provides utility classes, including LinkedCaseInsensitiveMap.

<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-core</artifactId>
    <version>5.2.5.RELEASE</version>
</dependency>

CaseInsensitiveMap is a hash-based Map, which converts keys to lower case before they are being added or retrieved. Unlike TreeMap, CaseInsensitiveMap allows null key inserting.

Eg:

Map<String, Integer> commonsHashMap = new CaseInsensitiveMap<>();
commonsHashMap.put("ABC", 1);
commonsHashMap.put("abc", 2);
commonsHashMap.put("aBc", 3);

System.out.println(commonsHashMap);

Output: {abc=3}

Dependency:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.4</version>
</dependency>

TreeMap is an implementation of NavigableMap, which means that it always sorts the entries after inserting, based on a given Comparator. Also, TreeMap uses a Comparator to find if an inserted key is a duplicate or a new one.

Therefore, if we provide a case-insensitive String Comparator, we'll get a case-insensitive TreeMap.

Eg:

Map<String, Integer> treeMap = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
treeMap.put("ABC", 1);
treeMap.put("ABc", 2);
treeMap.put("cde", 1);
        
System.out.println(treeMap);

Output: {ABC=2, cde=1}

Upvotes: 1

codekitty

Reputation: 1438

I like using ICU4J’s CaseInsensitiveString wrap of the Map key because it takes care of the hash\equals and issue and it works for unicode\i18n.

HashMap<CaseInsensitiveString, String> caseInsensitiveMap = new HashMap<>();
caseInsensitiveMap.put("tschüß", "bye");
caseInsensitiveMap.containsKey("TSCHÜSS"); # true

Upvotes: 0

Vishal

Reputation: 20617

As suggested by Guido García in their answer here:

import java.util.HashMap;

public class CaseInsensitiveMap extends HashMap<String, String> {

    @Override
    public String put(String key, String value) {
       return super.put(key.toLowerCase(), value);
    }

    // not @Override because that would require the key parameter to be of type Object
    public String get(String key) {
       return super.get(key.toLowerCase());
    }
}

https://commons.apache.org/proper/commons-collections/apidocs/org/apache/commons/collections4/map/CaseInsensitiveMap.html

Upvotes: 62

Stephen C

Reputation: 718788

One approach is to create a custom subclass of the Apache Commons AbstractHashedMap class, overriding the hash and isEqualKeys methods to perform case insensitive hashing and comparison of keys. (Note - I've never tried this myself ...)

This avoids the overhead of creating new objects each time you need to do a map lookup or update. And the common Map operations should O(1) ... just like a regular HashMap.

And if you are prepared to accept the implementation choices they have made, the Apache Commons CaseInsensitiveMap does the work of customizing / specializing AbstractHashedMap for you.

But if O(logN) get and put operations are acceptable, a TreeMap with a case insensitive string comparator is an option; e.g. using String.CASE_INSENSITIVE_ORDER.

And if you don't mind creating a new temporary String object each time you do a put or get, then Vishal's answer is just fine. (Though, I note that you wouldn't be preserving the original case of the keys if you did that ...)

Upvotes: 19

Roel Spilker

Reputation: 34462

Map<String, String> nodeMap = 
    new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

That's really all you need.

Upvotes: 394

Gabriel Belingueres

Reputation: 2975

Two choices come to my mind:

You could use directly the s.toUpperCase().hashCode(); as the key of the Map.
You could use a TreeMap<String> with a custom Comparator that ignore the case.

Otherwise, if you prefer your solution, instead of defining a new kind of String, I would rather implement a new Map with the required case insensibility functionality.

Upvotes: 5

Zdeněk Pavlas

Reputation: 357

Because of this, I'm creating a new object of CaseInsensitiveString for every event. So, it might hit performance.

Creating wrappers or converting key to lower case before lookup both create new objects. Writing your own java.util.Map implementation is the only way to avoid this. It's not too hard, and IMO is worth it. I found the following hash function to work pretty well, up to few hundred keys.

static int ciHashCode(String string)
{
    // length and the low 5 bits of hashCode() are case insensitive
    return (string.hashCode() & 0x1f)*33 + string.length();
}

Upvotes: 0

Cagatay Kalan

Reputation: 4126

This is an adapter for HashMaps which I implemented for a recent project. Works in a way similart to what @SandyR does, but encapsulates conversion logic so you don't manually convert strings to a wrapper object.

I used Java 8 features but with a few changes, you can adapt it to previous versions. I tested it for most common scenarios, except new Java 8 stream functions.

Basically it wraps a HashMap, directs all functions to it while converting strings to/from a wrapper object. But I had to also adapt KeySet and EntrySet because they forward some functions to the map itself. So I return two new Sets for keys and entries which actually wrap the original keySet() and entrySet().

One note: Java 8 has changed the implementation of putAll method which I could not find an easy way to override. So current implementation may have degraded performance especially if you use putAll() for a large data set.

Please let me know if you find a bug or have suggestions to improve the code.

package webbit.collections;

import java.util.*;
import java.util.function.*;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;


public class CaseInsensitiveMapAdapter<T> implements Map<String,T>
{
    private Map<CaseInsensitiveMapKey,T> map;
    private KeySet keySet;
    private EntrySet entrySet;


    public CaseInsensitiveMapAdapter()
    {
    }

    public CaseInsensitiveMapAdapter(Map<String, T> map)
    {
        this.map = getMapImplementation();
        this.putAll(map);
    }

    @Override
    public int size()
    {
        return getMap().size();
    }

    @Override
    public boolean isEmpty()
    {
        return getMap().isEmpty();
    }

    @Override
    public boolean containsKey(Object key)
    {
        return getMap().containsKey(lookupKey(key));
    }

    @Override
    public boolean containsValue(Object value)
    {
        return getMap().containsValue(value);
    }

    @Override
    public T get(Object key)
    {
        return getMap().get(lookupKey(key));
    }

    @Override
    public T put(String key, T value)
    {
        return getMap().put(lookupKey(key), value);
    }

    @Override
    public T remove(Object key)
    {
        return getMap().remove(lookupKey(key));
    }

    /***
     * I completely ignore Java 8 implementation and put one by one.This will be slower.
     */
    @Override
    public void putAll(Map<? extends String, ? extends T> m)
    {
        for (String key : m.keySet()) {
            getMap().put(lookupKey(key),m.get(key));
        }
    }

    @Override
    public void clear()
    {
        getMap().clear();
    }

    @Override
    public Set<String> keySet()
    {
        if (keySet == null)
            keySet = new KeySet(getMap().keySet());
        return keySet;
    }

    @Override
    public Collection<T> values()
    {
        return getMap().values();
    }

    @Override
    public Set<Entry<String, T>> entrySet()
    {
        if (entrySet == null)
            entrySet = new EntrySet(getMap().entrySet());
        return entrySet;
    }

    @Override
    public boolean equals(Object o)
    {
        return getMap().equals(o);
    }

    @Override
    public int hashCode()
    {
        return getMap().hashCode();
    }

    @Override
    public T getOrDefault(Object key, T defaultValue)
    {
        return getMap().getOrDefault(lookupKey(key), defaultValue);
    }

    @Override
    public void forEach(final BiConsumer<? super String, ? super T> action)
    {
        getMap().forEach(new BiConsumer<CaseInsensitiveMapKey, T>()
        {
            @Override
            public void accept(CaseInsensitiveMapKey lookupKey, T t)
            {
                action.accept(lookupKey.key,t);
            }
        });
    }

    @Override
    public void replaceAll(final BiFunction<? super String, ? super T, ? extends T> function)
    {
        getMap().replaceAll(new BiFunction<CaseInsensitiveMapKey, T, T>()
        {
            @Override
            public T apply(CaseInsensitiveMapKey lookupKey, T t)
            {
                return function.apply(lookupKey.key,t);
            }
        });
    }

    @Override
    public T putIfAbsent(String key, T value)
    {
        return getMap().putIfAbsent(lookupKey(key), value);
    }

    @Override
    public boolean remove(Object key, Object value)
    {
        return getMap().remove(lookupKey(key), value);
    }

    @Override
    public boolean replace(String key, T oldValue, T newValue)
    {
        return getMap().replace(lookupKey(key), oldValue, newValue);
    }

    @Override
    public T replace(String key, T value)
    {
        return getMap().replace(lookupKey(key), value);
    }

    @Override
    public T computeIfAbsent(String key, final Function<? super String, ? extends T> mappingFunction)
    {
        return getMap().computeIfAbsent(lookupKey(key), new Function<CaseInsensitiveMapKey, T>()
        {
            @Override
            public T apply(CaseInsensitiveMapKey lookupKey)
            {
                return mappingFunction.apply(lookupKey.key);
            }
        });
    }

    @Override
    public T computeIfPresent(String key, final BiFunction<? super String, ? super T, ? extends T> remappingFunction)
    {
        return getMap().computeIfPresent(lookupKey(key), new BiFunction<CaseInsensitiveMapKey, T, T>()
        {
            @Override
            public T apply(CaseInsensitiveMapKey lookupKey, T t)
            {
                return remappingFunction.apply(lookupKey.key, t);
            }
        });
    }

    @Override
    public T compute(String key, final BiFunction<? super String, ? super T, ? extends T> remappingFunction)
    {
        return getMap().compute(lookupKey(key), new BiFunction<CaseInsensitiveMapKey, T, T>()
        {
            @Override
            public T apply(CaseInsensitiveMapKey lookupKey, T t)
            {
                return remappingFunction.apply(lookupKey.key,t);
            }
        });
    }

    @Override
    public T merge(String key, T value, BiFunction<? super T, ? super T, ? extends T> remappingFunction)
    {
        return getMap().merge(lookupKey(key), value, remappingFunction);
    }

    protected  Map<CaseInsensitiveMapKey,T> getMapImplementation() {
        return new HashMap<>();
    }

    private Map<CaseInsensitiveMapKey,T> getMap() {
        if (map == null)
            map = getMapImplementation();
        return map;
    }

    private CaseInsensitiveMapKey lookupKey(Object key)
    {
        return new CaseInsensitiveMapKey((String)key);
    }

    public class CaseInsensitiveMapKey {
        private String key;
        private String lookupKey;

        public CaseInsensitiveMapKey(String key)
        {
            this.key = key;
            this.lookupKey = key.toUpperCase();
        }

        @Override
        public boolean equals(Object o)
        {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;

            CaseInsensitiveMapKey that = (CaseInsensitiveMapKey) o;

            return lookupKey.equals(that.lookupKey);

        }

        @Override
        public int hashCode()
        {
            return lookupKey.hashCode();
        }
    }

    private class KeySet implements Set<String> {

        private Set<CaseInsensitiveMapKey> wrapped;

        public KeySet(Set<CaseInsensitiveMapKey> wrapped)
        {
            this.wrapped = wrapped;
        }


        private List<String> keyList() {
            return stream().collect(Collectors.toList());
        }

        private Collection<CaseInsensitiveMapKey> mapCollection(Collection<?> c) {
            return c.stream().map(it -> lookupKey(it)).collect(Collectors.toList());
        }

        @Override
        public int size()
        {
            return wrapped.size();
        }

        @Override
        public boolean isEmpty()
        {
            return wrapped.isEmpty();
        }

        @Override
        public boolean contains(Object o)
        {
            return wrapped.contains(lookupKey(o));
        }

        @Override
        public Iterator<String> iterator()
        {
            return keyList().iterator();
        }

        @Override
        public Object[] toArray()
        {
            return keyList().toArray();
        }

        @Override
        public <T> T[] toArray(T[] a)
        {
            return keyList().toArray(a);
        }

        @Override
        public boolean add(String s)
        {
            return wrapped.add(lookupKey(s));
        }

        @Override
        public boolean remove(Object o)
        {
            return wrapped.remove(lookupKey(o));
        }

        @Override
        public boolean containsAll(Collection<?> c)
        {
            return keyList().containsAll(c);
        }

        @Override
        public boolean addAll(Collection<? extends String> c)
        {
            return wrapped.addAll(mapCollection(c));
        }

        @Override
        public boolean retainAll(Collection<?> c)
        {
            return wrapped.retainAll(mapCollection(c));
        }

        @Override
        public boolean removeAll(Collection<?> c)
        {
            return wrapped.removeAll(mapCollection(c));
        }

        @Override
        public void clear()
        {
            wrapped.clear();
        }

        @Override
        public boolean equals(Object o)
        {
            return wrapped.equals(lookupKey(o));
        }

        @Override
        public int hashCode()
        {
            return wrapped.hashCode();
        }

        @Override
        public Spliterator<String> spliterator()
        {
            return keyList().spliterator();
        }

        @Override
        public boolean removeIf(Predicate<? super String> filter)
        {
            return wrapped.removeIf(new Predicate<CaseInsensitiveMapKey>()
            {
                @Override
                public boolean test(CaseInsensitiveMapKey lookupKey)
                {
                    return filter.test(lookupKey.key);
                }
            });
        }

        @Override
        public Stream<String> stream()
        {
            return wrapped.stream().map(it -> it.key);
        }

        @Override
        public Stream<String> parallelStream()
        {
            return wrapped.stream().map(it -> it.key).parallel();
        }

        @Override
        public void forEach(Consumer<? super String> action)
        {
            wrapped.forEach(new Consumer<CaseInsensitiveMapKey>()
            {
                @Override
                public void accept(CaseInsensitiveMapKey lookupKey)
                {
                    action.accept(lookupKey.key);
                }
            });
        }
    }

    private class EntrySet implements Set<Map.Entry<String,T>> {

        private Set<Entry<CaseInsensitiveMapKey,T>> wrapped;

        public EntrySet(Set<Entry<CaseInsensitiveMapKey,T>> wrapped)
        {
            this.wrapped = wrapped;
        }


        private List<Map.Entry<String,T>> keyList() {
            return stream().collect(Collectors.toList());
        }

        private Collection<Entry<CaseInsensitiveMapKey,T>> mapCollection(Collection<?> c) {
            return c.stream().map(it -> new CaseInsensitiveEntryAdapter((Entry<String,T>)it)).collect(Collectors.toList());
        }

        @Override
        public int size()
        {
            return wrapped.size();
        }

        @Override
        public boolean isEmpty()
        {
            return wrapped.isEmpty();
        }

        @Override
        public boolean contains(Object o)
        {
            return wrapped.contains(lookupKey(o));
        }

        @Override
        public Iterator<Map.Entry<String,T>> iterator()
        {
            return keyList().iterator();
        }

        @Override
        public Object[] toArray()
        {
            return keyList().toArray();
        }

        @Override
        public <T> T[] toArray(T[] a)
        {
            return keyList().toArray(a);
        }

        @Override
        public boolean add(Entry<String,T> s)
        {
            return wrapped.add(null );
        }

        @Override
        public boolean remove(Object o)
        {
            return wrapped.remove(lookupKey(o));
        }

        @Override
        public boolean containsAll(Collection<?> c)
        {
            return keyList().containsAll(c);
        }

        @Override
        public boolean addAll(Collection<? extends Entry<String,T>> c)
        {
            return wrapped.addAll(mapCollection(c));
        }

        @Override
        public boolean retainAll(Collection<?> c)
        {
            return wrapped.retainAll(mapCollection(c));
        }

        @Override
        public boolean removeAll(Collection<?> c)
        {
            return wrapped.removeAll(mapCollection(c));
        }

        @Override
        public void clear()
        {
            wrapped.clear();
        }

        @Override
        public boolean equals(Object o)
        {
            return wrapped.equals(lookupKey(o));
        }

        @Override
        public int hashCode()
        {
            return wrapped.hashCode();
        }

        @Override
        public Spliterator<Entry<String,T>> spliterator()
        {
            return keyList().spliterator();
        }

        @Override
        public boolean removeIf(Predicate<? super Entry<String, T>> filter)
        {
            return wrapped.removeIf(new Predicate<Entry<CaseInsensitiveMapKey, T>>()
            {
                @Override
                public boolean test(Entry<CaseInsensitiveMapKey, T> entry)
                {
                    return filter.test(new FromCaseInsensitiveEntryAdapter(entry));
                }
            });
        }

        @Override
        public Stream<Entry<String,T>> stream()
        {
            return wrapped.stream().map(it -> new Entry<String, T>()
            {
                @Override
                public String getKey()
                {
                    return it.getKey().key;
                }

                @Override
                public T getValue()
                {
                    return it.getValue();
                }

                @Override
                public T setValue(T value)
                {
                    return it.setValue(value);
                }
            });
        }

        @Override
        public Stream<Map.Entry<String,T>> parallelStream()
        {
            return StreamSupport.stream(spliterator(), true);
        }

        @Override
        public void forEach(Consumer<? super Entry<String, T>> action)
        {
            wrapped.forEach(new Consumer<Entry<CaseInsensitiveMapKey, T>>()
            {
                @Override
                public void accept(Entry<CaseInsensitiveMapKey, T> entry)
                {
                    action.accept(new FromCaseInsensitiveEntryAdapter(entry));
                }
            });
        }
    }

    private class EntryAdapter implements Map.Entry<String,T> {
        private Entry<String,T> wrapped;

        public EntryAdapter(Entry<String, T> wrapped)
        {
            this.wrapped = wrapped;
        }

        @Override
        public String getKey()
        {
            return wrapped.getKey();
        }

        @Override
        public T getValue()
        {
            return wrapped.getValue();
        }

        @Override
        public T setValue(T value)
        {
            return wrapped.setValue(value);
        }

        @Override
        public boolean equals(Object o)
        {
            return wrapped.equals(o);
        }

        @Override
        public int hashCode()
        {
            return wrapped.hashCode();
        }


    }

    private class CaseInsensitiveEntryAdapter implements Map.Entry<CaseInsensitiveMapKey,T> {

        private Entry<String,T> wrapped;

        public CaseInsensitiveEntryAdapter(Entry<String, T> wrapped)
        {
            this.wrapped = wrapped;
        }

        @Override
        public CaseInsensitiveMapKey getKey()
        {
            return lookupKey(wrapped.getKey());
        }

        @Override
        public T getValue()
        {
            return wrapped.getValue();
        }

        @Override
        public T setValue(T value)
        {
            return wrapped.setValue(value);
        }
    }

    private class FromCaseInsensitiveEntryAdapter implements Map.Entry<String,T> {

        private Entry<CaseInsensitiveMapKey,T> wrapped;

        public FromCaseInsensitiveEntryAdapter(Entry<CaseInsensitiveMapKey, T> wrapped)
        {
            this.wrapped = wrapped;
        }

        @Override
        public String getKey()
        {
            return wrapped.getKey().key;
        }

        @Override
        public T getValue()
        {
            return wrapped.getValue();
        }

        @Override
        public T setValue(T value)
        {
            return wrapped.setValue(value);
        }
    }


}

Upvotes: -1

Nikhil Nanivadekar

Reputation: 1152

You can use a HashingStrategy based Map from Eclipse Collections

HashingStrategy<String> hashingStrategy =
    HashingStrategies.fromFunction(String::toUpperCase);
MutableMap<String, String> node = HashingStrategyMaps.mutable.of(hashingStrategy);

Note: I am a contributor to Eclipse Collections.

Upvotes: 3

sinuhepop

Reputation: 20307

Based on other answers, there are basically two approaches: subclassing HashMap or wrapping String. The first one requires a little more work. In fact, if you want to do it correctly, you must override almost all methods (containsKey, entrySet, get, put, putAll and remove).

Anyway, it has a problem. If you want to avoid future problems, you must specify a Locale in String case operations. So you would create new methods (get(String, Locale), ...). Everything is easier and clearer wrapping String:

public final class CaseInsensitiveString {

    private final String s;

    public CaseInsensitiveString(String s, Locale locale) {
        this.s = s.toUpperCase(locale);
    }

    // equals, hashCode & toString, no need for memoizing hashCode
}

And well, about your worries on performance: premature optimization is the root of all evil :)

Upvotes: 2

le-doude

Reputation: 3367

Wouldn't it be better to "wrap" the String in order to memorize the hashCode. In the normal String class hashCode() is O(N) the first time and then it is O(1) since it is kept for future use.

public class HashWrap {
    private final String value;
    private final int hash;

    public String get() {
        return value;
    }

    public HashWrap(String value) {
        this.value = value;
        String lc = value.toLowerCase();
        this.hash = lc.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o instanceof HashWrap) {
            HashWrap that = (HashWrap) o;
            return value.equalsIgnoreCase(that.value);
        } else {
            return false;
        }
    }

    @Override
    public int hashCode() {
        return this.hash;
    }

    //might want to implement compare too if you want to use with SortedMaps/Sets.
}

This would allow you to use any implementation of Hashtable in java and to have O(1) hasCode().

Upvotes: 4

Dave Newton

Reputation: 160191

Subclass HashMap and create a version that lower-cases the key on put and get (and probably the other key-oriented methods).

Or composite a HashMap into the new class and delegate everything to the map, but translate the keys.

If you need to keep the original key you could either maintain dual maps, or store the original key along with the value.

Upvotes: 7

Case insensitive string as HashMap key

Answers (14)

Related Questions