membersound

Reputation: 86905

How to get the key in Collectors.toMap merge function?

When a duplicate key entry is found during Collectors.toMap(), the merge function (o1, o2) is called.

Question: how can I get the key that caused the duplication?

String keyvalp = "test=one\ntest2=two\ntest2=three";

Pattern.compile("\n")
    .splitAsStream(keyvalp)
    .map(entry -> entry.split("="))
    .collect(Collectors.toMap(
        split -> split[0],
        split -> split[1],
        (o1, o2) -> {
            //TODO how to access the key that caused the duplicate? o1 and o2 are the values only
            //split[0]; //which is the key, cannot be accessed here
        },
        HashMap::new));

Inside the merge function I want to decide, based on the key, whether to cancel the mapping or to continue and take one of those values.

Upvotes: 24

Views: 10309

Answers (3)

levko

Reputation: 19

There is, of course, a simple trick: save the key in the key mapper function and read it back in the merge function. The code may look like the following (assuming the key is an Integer):

final AtomicInteger key = new AtomicInteger();
...collect(Collectors.toMap(
   item -> { key.set(item.getKey()); return item.getKey(); }, // key mapper
   item -> ..., // value mapper
   (v1, v2) -> { log(key.get(), v1, v2); return v1; } // merge function
));

Note: this is not suitable for parallel processing, because concurrent threads would overwrite the shared key holder.
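
As a rough illustration, the trick can be applied to the question's input as sketched below. The class name LastKeyTrick, the method parse and the duplicateKeys list are made up for this example. It also assumes that Collectors.toMap calls the key mapper for an element right before it may call the merge function for that element, which holds for sequential streams in the JDK versions I know of but is an implementation detail rather than a documented guarantee.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class LastKeyTrick {

    // Parses "key=value" lines; on a duplicate key it keeps the first value
    // and records the offending key in duplicateKeys (hypothetical helper).
    static Map<String, String> parse(String keyvalp, List<String> duplicateKeys) {
        // Holds the key most recently produced by the key mapper.
        AtomicReference<String> lastKey = new AtomicReference<>();
        return Pattern.compile("\n")
            .splitAsStream(keyvalp) // sequential stream only
            .map(entry -> entry.split("="))
            .collect(Collectors.toMap(
                split -> { lastKey.set(split[0]); return split[0]; }, // key mapper
                split -> split[1],                                     // value mapper
                (v1, v2) -> {                                          // merge function
                    duplicateKeys.add(lastKey.get());
                    return v1;
                }));
    }
}
```

With the input "test=one\ntest2=two\ntest2=three", the list passed in should end up containing only "test2", and the map keeps "two" for that key.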

Upvotes: 1

Holger

Reputation: 298539

The merge function has no way to get the key, which is the same issue the built-in function has when you omit the merge function.

The solution is to use a different toMap implementation, which does not rely on Map.merge:

public static <T, K, V> Collector<T, ?, Map<K,V>>
    toMap(Function<? super T, ? extends K> keyMapper,
          Function<? super T, ? extends V> valueMapper) {
    return Collector.of(HashMap::new,
        (m, t) -> {
            K k = keyMapper.apply(t);
            V v = Objects.requireNonNull(valueMapper.apply(t));
            if(m.putIfAbsent(k, v) != null) throw duplicateKey(k, m.get(k), v);
        },
        (m1, m2) -> {
            m2.forEach((k,v) -> {
                if(m1.putIfAbsent(k, v)!=null) throw duplicateKey(k, m1.get(k), v);
            });
            return m1;
        });
}
private static IllegalStateException duplicateKey(Object k, Object v1, Object v2) {
    return new IllegalStateException("Duplicate key "+k+" (values "+v1+" and "+v2+')');
}

(This is basically what Java 9’s implementation of toMap without a merge function will do)

So all you need to do in your code is to redirect the toMap call and omit the merge function:

String keyvalp = "test=one\ntest2=two\ntest2=three";

Map<String, String> map = Pattern.compile("\n")
        .splitAsStream(keyvalp)
        .map(entry -> entry.split("="))
        .collect(toMap(split -> split[0], split -> split[1]));

(or ContainingClass.toMap if it is neither in the same class nor statically imported)

The collector supports parallel processing like the original toMap collector, though it’s not very likely to get a benefit from parallel processing here, even with more elements to process.

If I understand you correctly and you only want to pick either the older or the newer value in the merge function based on the actual key, you could do it with a key Predicate like this:

public static <T, K, V> Collector<T, ?, Map<K,V>>
    toMap(Function<? super T, ? extends K> keyMapper,
          Function<? super T, ? extends V> valueMapper,
          Predicate<? super K> useOlder) {
    return Collector.of(HashMap::new,
        (m, t) -> {
            K k = keyMapper.apply(t);
            m.merge(k, valueMapper.apply(t), (a,b) -> useOlder.test(k)? a: b);
        },
        (m1, m2) -> {
            m2.forEach((k,v) -> m1.merge(k, v, (a,b) -> useOlder.test(k)? a: b));
            return m1;
        });
}

Map<String, String> map = Pattern.compile("\n")
        .splitAsStream(keyvalp)
        .map(entry -> entry.split("="))
        .collect(toMap(split -> split[0], split -> split[1], key -> condition));

There are several ways to customize this collector…
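
One possible customization, sketched here under my own assumptions, is to hand the key to a three-argument merge callback instead of a Predicate. The KeyedMerger interface, the class name KeyAwareToMap and the parse helper are hypothetical names for this example; nothing like them exists in the JDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collector;

final class KeyAwareToMap {

    // Hypothetical callback: receives the key plus the older and newer value.
    @FunctionalInterface
    interface KeyedMerger<K, V> {
        V merge(K key, V older, V newer);
    }

    static <T, K, V> Collector<T, ?, Map<K, V>> toMap(
            Function<? super T, ? extends K> keyMapper,
            Function<? super T, ? extends V> valueMapper,
            KeyedMerger<? super K, V> merger) {
        return Collector.of(HashMap::new,
            (m, t) -> {
                K k = keyMapper.apply(t);
                // k is captured so the merger can see which key collided.
                m.merge(k, valueMapper.apply(t), (a, b) -> merger.merge(k, a, b));
            },
            (m1, m2) -> {
                m2.forEach((k, v) -> m1.merge(k, v, (a, b) -> merger.merge(k, a, b)));
                return m1;
            });
    }

    // Example use: for "test2" the newer value wins, otherwise the older one.
    static Map<String, String> parse(String keyvalp) {
        return Pattern.compile("\n")
            .splitAsStream(keyvalp)
            .map(entry -> entry.split("="))
            .collect(toMap(split -> split[0], split -> split[1],
                (key, older, newer) -> key.equals("test2") ? newer : older));
    }
}
```

Because this sketch delegates to Map.merge, a merger that returns null removes the entry for that key, though a later element with the same key would add it again. A merger that throws aborts the whole collection.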

Upvotes: 6

Peter Lawrey

Reputation: 533820

You need to use a custom collector or use a different approach.

Map<String, String> map = new HashMap<>();
Pattern.compile("\n")
    .splitAsStream(keyvalp)
    .map(entry -> entry.split("="))
    .forEach(arr -> map.merge(arr[0], arr[1], (o1, o2) -> /* decide based on arr[0]; o1 is a placeholder */ o1));

Writing a custom collector is rather more complicated. You would need something like a TriConsumer (key and two values), which is not in the JDK, which is why I am pretty sure there is no built-in function that uses one. ;)
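
To make the forEach approach above concrete, here is a minimal sketch as a complete method. The class name MergeWithKeyDemo and the rule that keys starting with "strict." must be unique are invented for illustration; substitute whatever key-based decision you need.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class MergeWithKeyDemo {

    static Map<String, String> parse(String keyvalp) {
        Map<String, String> map = new HashMap<>();
        Pattern.compile("\n")
            .splitAsStream(keyvalp)
            .map(entry -> entry.split("="))
            .forEach(arr -> map.merge(arr[0], arr[1], (older, newer) -> {
                // arr[0] is the key, captured from the enclosing lambda.
                if (arr[0].startsWith("strict.")) { // hypothetical rule
                    throw new IllegalStateException("Duplicate key " + arr[0]
                        + " (values " + older + " and " + newer + ")");
                }
                return newer; // for all other keys the later value wins
            }));
        return map;
    }
}
```

Since forEach mutates a plain HashMap here, this is only safe on a sequential stream.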

Upvotes: 6
