Reputation: 10285
Suppose you have a Map<String, Object> myMap. Given the expression "some.string.*", I have to retrieve all the values from myMap whose keys start with this expression.
I am trying to avoid for loops because myMap will be given a set of expressions, not only one, and using a for loop for each expression becomes cumbersome performance-wise.
What is the fastest way to do this?
Upvotes: 28
Views: 46620
Reputation: 19573
The accepted answer works in 99% of all cases, but the devil is in the details.
Specifically, the accepted answer does not work when the map has a key which begins with the prefix, followed by Character.MAX_VALUE, followed by anything else. Comments posted to the accepted answer yield small improvements, but still do not cover all of the cases.
The following solution also uses NavigableMap to pick out a sub map given a key prefix. The solution is the subMapFrom() method, and the trick is not to bump/increment the last char of the prefix but rather the last char which is not MAX_VALUE, whilst cutting off all trailing MAX_VALUEs. So for example, if the prefix is "abc" we increment it to "abd". But if the prefix is "ab" + MAX_VALUE we drop the last char and bump the preceding char instead, resulting in "ac".
import static java.lang.Character.MAX_VALUE;

import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.stream.Stream;

public class App
{
    public static void main(String[] args) {
        NavigableMap<String, String> map = new TreeMap<>();

        String[] keys = {
            "a",
            "b",
            "b" + MAX_VALUE,
            "b" + MAX_VALUE + "any",
            "c"
        };

        // Populate map
        Stream.of(keys).forEach(k -> map.put(k, ""));

        // For each key that starts with 'b', find the sub map
        Stream.of(keys).filter(s -> s.startsWith("b")).forEach(p -> {
            System.out.println("Looking for sub map using prefix \"" + p + "\".");

            // Always returns expected sub maps with no misses
            // [b, b\uFFFF, b\uFFFFany], [b\uFFFF, b\uFFFFany] and [b\uFFFFany]
            System.out.println("My solution: " +
                    subMapFrom(map, p).keySet());

            // WRONG! Prefix "b" misses "b\uFFFFany"
            System.out.println("SO answer: " +
                    map.subMap(p, true, p + MAX_VALUE, true).keySet());

            // WRONG! Prefix "b\uFFFF" misses "b\uFFFF" and "b\uFFFFany"
            System.out.println("SO comment: " +
                    map.subMap(p, true, tryIncrementLastChar(p), false).keySet());

            System.out.println();
        });
    }

    private static <V> NavigableMap<String, V> subMapFrom(
            NavigableMap<String, V> map, String keyPrefix)
    {
        final String fromKey = keyPrefix, toKey; // toKey is assigned below

        // Alias
        String p = keyPrefix;

        if (p.isEmpty()) {
            // No need for a sub map
            return map;
        }

        // ("ab" + MAX_VALUE + MAX_VALUE + ...) returns index 1
        final int i = lastIndexOfNonMaxChar(p);

        if (i == -1) {
            // Prefix is all MAX_VALUE through and through, so grab rest of map
            return map.tailMap(p, true);
        }

        if (i < p.length() - 1) {
            // Target char for bumping is not last char; cut out the residue
            // ("ab" + MAX_VALUE + MAX_VALUE + ...) becomes "ab"
            p = p.substring(0, i + 1);
        }

        toKey = bumpChar(p, i);

        return map.subMap(fromKey, true, toKey, false);
    }

    private static int lastIndexOfNonMaxChar(String str) {
        int i = str.length();

        // Walk backwards, while we have a valid index
        while (--i >= 0) {
            if (str.charAt(i) < MAX_VALUE) {
                return i;
            }
        }

        return -1;
    }

    private static String bumpChar(String str, int pos) {
        assert !str.isEmpty();
        assert pos >= 0 && pos < str.length();

        final char c = str.charAt(pos);
        assert c < MAX_VALUE;

        StringBuilder b = new StringBuilder(str);
        b.setCharAt(pos, (char) (c + 1));
        return b.toString();
    }

    private static String tryIncrementLastChar(String p) {
        char l = p.charAt(p.length() - 1);
        return l == MAX_VALUE ?
                // Last character already max, do nothing
                p :
                // Bump last character
                p.substring(0, p.length() - 1) + ++l;
    }
}
Output (U+FFFF is written here as \uFFFF):
Looking for sub map using prefix "b".
My solution: [b, b\uFFFF, b\uFFFFany]
SO answer: [b, b\uFFFF]
SO comment: [b, b\uFFFF, b\uFFFFany]
Looking for sub map using prefix "b\uFFFF".
My solution: [b\uFFFF, b\uFFFFany]
SO answer: [b\uFFFF, b\uFFFFany]
SO comment: []
Looking for sub map using prefix "b\uFFFFany".
My solution: [b\uFFFFany]
SO answer: [b\uFFFFany]
SO comment: [b\uFFFFany]
It should perhaps be added that I also tried various other approaches, including code I found elsewhere on the internet. All of them failed, either by yielding an incorrect result or by crashing outright with various exceptions.
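As a condensed restatement of the bump rule used by subMapFrom(), here is a minimal sketch that computes only the exclusive upper bound for a prefix. The class name PrefixBound and the methods upperBound and byPrefix are made up for this illustration and are not part of the answer's code; it assumes keys are compared with the natural String ordering.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixBound {

    // Returns an exclusive upper bound for all strings starting with prefix,
    // or null when no bound exists (empty prefix, or prefix made only of U+FFFF).
    public static String upperBound(String prefix) {
        int i = prefix.length() - 1;
        // Skip trailing U+FFFF chars; they cannot be incremented.
        while (i >= 0 && prefix.charAt(i) == Character.MAX_VALUE) {
            i--;
        }
        if (i < 0) {
            return null;
        }
        // Keep everything before position i and bump the char at position i.
        return prefix.substring(0, i) + (char) (prefix.charAt(i) + 1);
    }

    // Hypothetical helper: view of the entries whose key starts with prefix.
    public static <V> NavigableMap<String, V> byPrefix(NavigableMap<String, V> map, String prefix) {
        String to = upperBound(prefix);
        return to == null ? map.tailMap(prefix, true) : map.subMap(prefix, true, to, false);
    }

    public static void main(String[] args) {
        System.out.println(upperBound("abc"));                      // abd
        System.out.println(upperBound("ab" + Character.MAX_VALUE)); // ac

        NavigableMap<String, String> map = new TreeMap<>();
        for (String k : new String[] {"a", "b", "b" + Character.MAX_VALUE,
                "b" + Character.MAX_VALUE + "any", "c"}) {
            map.put(k, "");
        }
        System.out.println(byPrefix(map, "b").size());                       // 3
        System.out.println(byPrefix(map, "b" + Character.MAX_VALUE).size()); // 2
    }
}
```

Returning null for the "no upper bound" case keeps the sketch short; an Optional would arguably be cleaner in real code.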
Upvotes: 4
Reputation: 8798
Remove all keys which do not start with your desired prefix (note that this modifies the map in place):
yourMap.keySet().removeIf(key -> !key.startsWith(keyPrefix));
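If the original map must stay intact, the same test can be used to copy the matching entries instead. This is only a sketch; the class PrefixCopy and the method copyByPrefix are names invented here, and it still visits every key once.

```java
import java.util.HashMap;
import java.util.Map;

public class PrefixCopy {

    // Returns a new map holding only the entries whose key starts with prefix.
    // The source map is not modified.
    public static <V> Map<String, V> copyByPrefix(Map<String, V> source, String prefix) {
        Map<String, V> result = new HashMap<>();
        source.forEach((key, value) -> {
            if (key.startsWith(prefix)) {
                result.put(key, value);
            }
        });
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> yourMap = new HashMap<>();
        yourMap.put("some.string.a", 1);
        yourMap.put("some.string.b", 2);
        yourMap.put("other", 3);

        Map<String, Integer> filtered = copyByPrefix(yourMap, "some.string.");
        System.out.println(filtered.size()); // 2
        System.out.println(yourMap.size());  // 3
    }
}
```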
Upvotes: 2
Reputation: 6040
If you work with a NavigableMap (e.g. TreeMap), you can take advantage of the underlying tree data structure and do something like this (locating the range has O(lg(N)) complexity):
    public SortedMap<String, Object> getByPrefix(
            NavigableMap<String, Object> myMap,
            String prefix ) {
        return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
    }
A more expanded example:
    import java.util.NavigableMap;
    import java.util.SortedMap;
    import java.util.TreeMap;

    public class Test {

        public static void main( String[] args ) {
            TreeMap<String, Object> myMap = new TreeMap<String, Object>();
            myMap.put( "111-hello", null );
            myMap.put( "111-world", null );
            myMap.put( "111-test", null );
            myMap.put( "111-java", null );

            myMap.put( "123-one", null );
            myMap.put( "123-two", null );
            myMap.put( "123--three", null );
            myMap.put( "123--four", null );

            myMap.put( "125-hello", null );
            myMap.put( "125--world", null );

            System.out.println( "111 \t" + getByPrefix( myMap, "111" ) );
            System.out.println( "123 \t" + getByPrefix( myMap, "123" ) );
            System.out.println( "123-- \t" + getByPrefix( myMap, "123--" ) );
            System.out.println( "12 \t" + getByPrefix( myMap, "12" ) );
        }

        private static SortedMap<String, Object> getByPrefix(
                NavigableMap<String, Object> myMap,
                String prefix ) {
            return myMap.subMap( prefix, prefix + Character.MAX_VALUE );
        }
    }
Output is:
111 {111-hello=null, 111-java=null, 111-test=null, 111-world=null}
123 {123--four=null, 123--three=null, 123-one=null, 123-two=null}
123-- {123--four=null, 123--three=null}
12 {123--four=null, 123--three=null, 123-one=null, 123-two=null, 125--world=null, 125-hello=null}
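Since the question mentions a whole set of expressions, the lookup above can be repeated once per prefix. The following is a sketch under some assumptions: the trailing ".*" has already been stripped from each expression, the class MultiPrefix and the method valuesForPrefixes are hypothetical names, and overlapping prefixes will yield the same value more than once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class MultiPrefix {

    // Collects the values for every prefix with one range lookup per prefix.
    // There is still a loop over the prefixes, but no scan over all keys.
    public static <V> List<V> valuesForPrefixes(NavigableMap<String, V> map,
                                                Iterable<String> prefixes) {
        List<V> result = new ArrayList<>();
        for (String prefix : prefixes) {
            // Same range as getByPrefix: from prefix (inclusive)
            // to prefix + MAX_VALUE (exclusive).
            result.addAll(map.subMap(prefix, prefix + Character.MAX_VALUE).values());
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> myMap = new TreeMap<>();
        myMap.put("111-hello", 1);
        myMap.put("111-world", 2);
        myMap.put("123-one", 3);
        myMap.put("125-hello", 4);

        System.out.println(valuesForPrefixes(myMap, Arrays.asList("111", "125"))); // [1, 2, 4]
    }
}
```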
Upvotes: 44
Reputation: 65811
I wrote a MapFilter recently for just such a need. You can also filter filtered maps, which makes them really useful.
If your expressions have common roots like "some.byte" and "some.string", then filtering by the common root first ("some." in this case) will save you a great deal of time. See main for some trivial examples.
Note that making changes to the filtered map changes the underlying map.
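import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class MapFilter<T> implements Map<String, T> {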
// The enclosed map -- could also be a MapFilter.
final private Map<String, T> map;
// Use a TreeMap for predictable iteration order.
// Store Map.Entry to reflect changes down into the underlying map.
// The Key is the shortened string. The entry.key is the full string.
final private Map<String, Map.Entry<String, T>> entries = new TreeMap<>();
// The prefix they are looking for in this map.
final private String prefix;
public MapFilter(Map<String, T> map, String prefix) {
// Store my backing map.
this.map = map;
// Record my prefix.
this.prefix = prefix;
// Build my entries.
rebuildEntries();
}
public MapFilter(Map<String, T> map) {
this(map, "");
}
private synchronized void rebuildEntries() {
// Start empty.
entries.clear();
// Build my entry set.
for (Map.Entry<String, T> e : map.entrySet()) {
String key = e.getKey();
// Retain each one that starts with the specified prefix.
if (key.startsWith(prefix)) {
// Key it on the remainder.
String k = key.substring(prefix.length());
// Entries k always contains the LAST occurrence if there are multiples.
entries.put(k, e);
}
}
}
@Override
public String toString() {
return "MapFilter(" + prefix + ") of " + map + " containing " + entrySet();
}
// Constructor from a properties file.
public MapFilter(Properties p, String prefix) {
// Properties extends Hashtable<Object,Object> so it implements Map.
// I need Map<String,T> so I wrap it in a HashMap for simplicity.
// Java-8 breaks if we use diamond inference.
this(new HashMap<>((Map) p), prefix);
}
// Helper to fast filter the map.
public MapFilter<T> filter(String prefix) {
// Wrap me in a new filter.
return new MapFilter<>(this, prefix);
}
// Count my entries.
@Override
public int size() {
return entries.size();
}
// Are we empty.
@Override
public boolean isEmpty() {
return entries.isEmpty();
}
// Is this key in me?
@Override
public boolean containsKey(Object key) {
return entries.containsKey(key);
}
// Is this value in me.
@Override
public boolean containsValue(Object value) {
// Walk the values.
for (Map.Entry<String, T> e : entries.values()) {
if (value.equals(e.getValue())) {
// Its there!
return true;
}
}
return false;
}
// Get the referenced value - if present.
@Override
public T get(Object key) {
return get(key, null);
}
// Get the referenced value - if present.
public T get(Object key, T dflt) {
Map.Entry<String, T> e = entries.get((String) key);
return e != null ? e.getValue() : dflt;
}
// Add to the underlying map.
@Override
public T put(String key, T value) {
T old = null;
// Do I have an entry for it already?
Map.Entry<String, T> entry = entries.get(key);
// Was it already there?
if (entry != null) {
// Yes. Just update it.
old = entry.setValue(value);
} else {
// Add it to the map.
map.put(prefix + key, value);
// Rebuild.
rebuildEntries();
}
return old;
}
// Get rid of that one.
@Override
public T remove(Object key) {
// Do I have an entry for it?
Map.Entry<String, T> entry = entries.get((String) key);
if (entry != null) {
entries.remove(key);
// Change the underlying map.
return map.remove(prefix + key);
}
return null;
}
// Add all of them.
@Override
public void putAll(Map<? extends String, ? extends T> m) {
for (Map.Entry<? extends String, ? extends T> e : m.entrySet()) {
put(e.getKey(), e.getValue());
}
}
// Clear everything out.
@Override
public void clear() {
// Just remove mine.
// Only the filtered entries are removed from the underlying map; everything else in it is left alone.
for (String key : entries.keySet()) {
map.remove(prefix + key);
}
entries.clear();
}
@Override
public Set<String> keySet() {
return entries.keySet();
}
@Override
public Collection<T> values() {
// Roll them all out into a new ArrayList.
List<T> values = new ArrayList<>();
for (Map.Entry<String, T> v : entries.values()) {
values.add(v.getValue());
}
return values;
}
@Override
public Set<Map.Entry<String, T>> entrySet() {
// Roll them all out into a new TreeSet.
Set<Map.Entry<String, T>> entrySet = new TreeSet<>();
for (Map.Entry<String, Map.Entry<String, T>> v : entries.entrySet()) {
entrySet.add(new Entry<>(v));
}
return entrySet;
}
/**
* An entry.
*
* @param <T> The type of the value.
*/
private static class Entry<T> implements Map.Entry<String, T>, Comparable<Entry<T>> {
// Note that entry in the entry is an entry in the underlying map.
private final Map.Entry<String, Map.Entry<String, T>> entry;
Entry(Map.Entry<String, Map.Entry<String, T>> entry) {
this.entry = entry;
}
@Override
public String getKey() {
return entry.getKey();
}
@Override
public T getValue() {
// Remember that the value is the entry in the underlying map.
return entry.getValue().getValue();
}
@Override
public T setValue(T newValue) {
// Remember that the value is the entry in the underlying map.
return entry.getValue().setValue(newValue);
}
@Override
public boolean equals(Object o) {
if (!(o instanceof Entry)) {
return false;
}
Entry e = (Entry) o;
return getKey().equals(e.getKey()) && getValue().equals(e.getValue());
}
@Override
public int hashCode() {
return getKey().hashCode() ^ getValue().hashCode();
}
@Override
public String toString() {
return getKey() + "=" + getValue();
}
@Override
public int compareTo(Entry<T> o) {
return getKey().compareTo(o.getKey());
}
}
// Simple tests.
public static void main(String[] args) {
String[] samples = {
"Some.For.Me",
"Some.For.You",
"Some.More",
"Yet.More"};
Map map = new HashMap();
for (String s : samples) {
map.put(s, s);
}
Map all = new MapFilter(map);
Map some = new MapFilter(map, "Some.");
Map someFor = new MapFilter(some, "For.");
System.out.println("All: " + all);
System.out.println("Some: " + some);
System.out.println("Some.For: " + someFor);
Properties props = new Properties();
props.setProperty("namespace.prop1", "value1");
props.setProperty("namespace.prop2", "value2");
props.setProperty("namespace.iDontKnowThisNameAtCompileTime", "anothervalue");
props.setProperty("someStuff.morestuff", "stuff");
Map<String, String> filtered = new MapFilter(props, "namespace.");
System.out.println("namespace props " + filtered);
}
}
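For comparison, the idea of narrowing by a common root first and then filtering again can also be sketched with the sub map views of a NavigableMap. This is a different technique from MapFilter, not a replacement for it: the keys keep their full prefix rather than being shortened, and it uses the simple prefix + MAX_VALUE bound, so keys containing U+FFFF right after the prefix are not handled. The class NestedPrefixViews and the method view are names invented for this sketch.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class NestedPrefixViews {

    // Live view of the entries whose key starts with prefix.
    public static <V> NavigableMap<String, V> view(NavigableMap<String, V> map, String prefix) {
        return map.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> map = new TreeMap<>();
        map.put("Some.For.Me", "a");
        map.put("Some.For.You", "b");
        map.put("Some.More", "c");
        map.put("Yet.More", "d");

        // Narrow by the common root first, then filter the filtered view.
        NavigableMap<String, String> some = view(map, "Some.");
        NavigableMap<String, String> someFor = view(some, "Some.For.");

        System.out.println(some.keySet());    // [Some.For.Me, Some.For.You, Some.More]
        System.out.println(someFor.keySet()); // [Some.For.Me, Some.For.You]

        // Writes to a view go through to the underlying map.
        someFor.put("Some.For.Them", "e");
        System.out.println(map.containsKey("Some.For.Them")); // true
    }
}
```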
Upvotes: 5
Reputation: 16209
I used this code to do a speed trial:
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Random;
import java.util.Set;

public class KeyFinder {

    private static Random random = new Random();

    private interface Receiver {
        void receive(String value);
    }

    public static void main(String[] args) {
        for (int trials = 0; trials < 10; trials++) {
            doTrial();
        }
    }

    private static void doTrial() {
        // 10000 random keys
        final Map<String, String> map = new HashMap<String, String>();
        giveRandomElements(new Receiver() {
            public void receive(String value) {
                map.put(value, null);
            }
        }, 10000);

        // 1000 random expressions (prefixes)
        final Set<String> expressions = new HashSet<String>();
        giveRandomElements(new Receiver() {
            public void receive(String value) {
                expressions.add(value);
            }
        }, 1000);

        // Check every key against every expression
        int hits = 0;
        long start = System.currentTimeMillis();
        for (String expression : expressions) {
            for (String key : map.keySet()) {
                if (key.startsWith(expression)) {
                    hits++;
                }
            }
        }
        long stop = System.currentTimeMillis();
        System.out.printf("Found %s hits in %s ms\n", hits, stop - start);
    }

    private static void giveRandomElements(Receiver receiver, int count) {
        for (int i = 0; i < count; i++) {
            String value = String.valueOf(random.nextLong());
            receiver.receive(value);
        }
    }
}
The output was:
Found 0 hits in 1649 ms
Found 0 hits in 1626 ms
Found 0 hits in 1389 ms
Found 0 hits in 1396 ms
Found 0 hits in 1417 ms
Found 0 hits in 1388 ms
Found 0 hits in 1377 ms
Found 0 hits in 1395 ms
Found 0 hits in 1399 ms
Found 0 hits in 1357 ms
This counts how many of 10,000 random keys start with any one of 1,000 random String values (10M checks).
So about 1.4 seconds on a simple dual-core laptop; is that too slow for you?
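To compare against a sorted structure on the same kind of workload, the trial could be extended roughly as follows. This is a sketch, not a rigorous benchmark: the class SortedKeyFinder and its methods are invented names, the seed is arbitrary, there is no JIT warm-up, and the simple prefix + MAX_VALUE bound is safe here only because the generated keys never contain U+FFFF. No timings are claimed.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.NavigableSet;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

public class SortedKeyFinder {

    // Baseline: test every key against every expression.
    public static int countNaive(Collection<String> keys, Collection<String> expressions) {
        int hits = 0;
        for (String expression : expressions) {
            for (String key : keys) {
                if (key.startsWith(expression)) {
                    hits++;
                }
            }
        }
        return hits;
    }

    // Sorted variant: one range lookup per expression instead of a full scan.
    public static int countSorted(NavigableSet<String> sortedKeys, Collection<String> expressions) {
        int hits = 0;
        for (String expression : expressions) {
            hits += sortedKeys
                    .subSet(expression, true, expression + Character.MAX_VALUE, false)
                    .size();
        }
        return hits;
    }

    private static List<String> randomStrings(Random random, int count) {
        List<String> values = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            values.add(String.valueOf(random.nextLong()));
        }
        return values;
    }

    public static void main(String[] args) {
        Random random = new Random(42); // fixed seed, an arbitrary choice
        Set<String> keys = new HashSet<>(randomStrings(random, 10000));
        List<String> expressions = randomStrings(random, 1000);

        long t0 = System.nanoTime();
        int naive = countNaive(keys, expressions);
        long t1 = System.nanoTime();
        // Building the sorted set is included in the second measurement.
        NavigableSet<String> sorted = new TreeSet<>(keys);
        int viaTree = countSorted(sorted, expressions);
        long t2 = System.nanoTime();

        System.out.printf("naive:  %d hits in %d ms%n", naive, (t1 - t0) / 1_000_000);
        System.out.printf("sorted: %d hits in %d ms%n", viaTree, (t2 - t1) / 1_000_000);
    }
}
```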
Upvotes: 1
Reputation: 11958
A map's key set has no special structure, so I think you have to check each of the keys anyway. So you can't find a way that is faster than a single loop...
Upvotes: 1