(def data {"Bob" {"A" 3.5 "B" 4.5 "C" 2.0}
           "Jane" {"A" 2.0 "B" 1.5 "D" 4.0}})
(merge-with + (data "Bob") (data "Jane"))
{"A" 5.5, "B" 6.0, "C" 2.0, "D" 4.0}
I want to create a merged map, but only for the keys common to both maps. The result I'm looking for is
{"A" 5.5, "B" 6.0}
What's a good way to do this in Clojure?
(defn reduce-merge [& maps]
  (when (some identity maps)
    (reduce #(select-keys (or %2 %1) (keys %1)) maps)))
Worked well for me; the call to or is there to swallow nils in the maps list. It does not handle deep merging, and it does not take a function for resolving collisions (which will always happen), so for each common key the value from the later map is kept.
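For the missing collision function, here is a minimal sketch of a variant that folds over the maps and combines values with a caller-supplied f. The name reduce-merge-with is made up for illustration and is not part of the answer above; nil maps are skipped, which is an assumption about what "swallow nils" should mean.

```clojure
;; Hypothetical helper (not from the answer above): keeps only the keys
;; present in every non-nil map and combines their values with f,
;; left to right.
(defn reduce-merge-with [f & maps]
  (let [maps (remove nil? maps)]
    (when (seq maps)
      (reduce (fn [acc m]
                ;; Rebuild acc, keeping only the keys that m also has.
                (reduce-kv (fn [out k v]
                             (if-let [e (find m k)]
                               (assoc out k (f v (val e)))
                               out))
                           {}
                           acc))
              maps))))

(= {"A" 5.5 "B" 6.0}
   (reduce-merge-with + {"A" 3.5 "B" 4.5 "C" 2.0}
                        {"A" 2.0 "B" 1.5 "D" 4.0}))
;;=> true
```

With a single non-nil map the map is returned unchanged, since there is nothing to combine it with.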
Performance-oriented solution using transients, reduce-kv and a size check to iterate over the smaller map:
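(defn merge-common-with [f m1 m2]
  ;; a is the smaller map, so we iterate over as few entries as possible
  (let [[a b] (if (< (count m1) (count m2))
                [m1 m2]
                [m2 m1])]
    ;; build the result in a transient, then freeze it
    (persistent!
     (reduce-kv (fn [out k v]
                  (if (contains? b k)
                    (assoc! out k (f (get a k) (get b k)))
                    out))
                (transient {})
                a))))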
At the REPL, using sample data from the question text:
(merge-common-with + (data "Bob") (data "Jane"))
;= {"A" 5.5, "B" 6.0}
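One caveat with the size check: when m1 is the larger map, a and b swap roles, so f is called with m2's value first. That makes no difference for + but it does for a non-commutative f such as -. A minimal sketch of an order-preserving variant follows; the name merge-common-with-ordered is hypothetical and not part of the answer above.

```clojure
;; Hypothetical variant: still walks the smaller map, but always calls
;; (f value-from-m1 value-from-m2).
(defn merge-common-with-ordered [f m1 m2]
  (let [swap? (> (count m1) (count m2))
        [small large] (if swap? [m2 m1] [m1 m2])]
    (persistent!
     (reduce-kv (fn [out k v]
                  (if-let [e (find large k)]
                    (assoc! out k (if swap?
                                    (f (val e) v)   ; v came from m2
                                    (f v (val e)))) ; v came from m1
                    out))
                (transient {})
                small))))

(merge-common-with-ordered - {"A" 5 "B" 2} {"A" 1})
;;=> {"A" 4}
(merge-common-with-ordered - {"A" 5} {"A" 1 "B" 2})
;;=> {"A" 4}
```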
Note that while I expect the above to be the fastest approach in many circumstances, I'd definitely benchmark using data typical for your actual use case. Here's a Criterium-based benchmark using data from the question text (merge-common-with wins here):
(require '[criterium.core :as c])
(def a (data "Bob"))
(def b (data "Jane"))
;; Hendekagon's elegant approach amended to select-keys on both sides
(defn merge-common-with* [f a b]
  (merge-with f
              (select-keys a (keys b))
              (select-keys b (keys a))))
;; benchmarks for three approaches follow, fastest to slowest
(c/bench (merge-common-with + a b))
Evaluation count : 74876640 in 60 samples of 1247944 calls.
Execution time mean : 783.233604 ns
Execution time std-deviation : 7.660391 ns
Execution time lower quantile : 771.514052 ns ( 2.5%)
Execution time upper quantile : 802.622953 ns (97.5%)
Overhead used : 1.266543 ns
Found 3 outliers in 60 samples (5.0000 %)
low-severe 3 (5.0000 %)
Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
(c/bench (merge-matching + a b)) ; amalloy's approach
Evaluation count : 57320640 in 60 samples of 955344 calls.
Execution time mean : 1.047921 µs
Execution time std-deviation : 16.221173 ns
Execution time lower quantile : 1.025001 µs ( 2.5%)
Execution time upper quantile : 1.076081 µs (97.5%)
Overhead used : 1.266543 ns
(c/bench (merge-common-with* + a b))
WARNING: Final GC required 3.4556868188006065 % of runtime
Evaluation count : 33121200 in 60 samples of 552020 calls.
Execution time mean : 1.862483 µs
Execution time std-deviation : 26.008801 ns
Execution time lower quantile : 1.821841 µs ( 2.5%)
Execution time upper quantile : 1.914336 µs (97.5%)
Overhead used : 1.266543 ns
Found 1 outliers in 60 samples (1.6667 %)
low-severe 1 (1.6667 %)
Variance from outliers : 1.6389 % Variance is slightly inflated by outliers
Here is a fairly straightforward single-pass approach, which should outperform the multi-pass approaches so far suggested, without being particularly difficult to read:
(defn merge-matching [f a b]
  (into {}
        (for [[k v] a
              :let [e (find b k)]
              :when e]
          [k (f v (val e))])))
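A small illustration of why find is a good fit here: it returns the map entry itself, so a key that is present in b but mapped to nil still counts as a match, which a plain truthiness test on (get b k) would miss. The sample maps below are made up for the example; merge-matching is repeated only so the snippet runs on its own.

```clojure
(defn merge-matching [f a b]
  (into {}
        (for [[k v] a
              :let [e (find b k)] ; nil only when k is absent from b
              :when e]
          [k (f v (val e))])))

;; :x is present in both maps (with a nil value in the second one),
;; :y and :z each appear in only one map.
(merge-matching vector {:x 1 :y 2} {:x nil :z 3})
;;=> {:x [1 nil]}
```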
If you always want to merge only two objects, you can also do something like this.
(into {}
      (for [[kx vx] (data "Bob")
            [ky vy] (data "Jane")
            :when (= kx ky)]
        {kx (+ vx vy)}))
If you want to merge multiple objects, then you can define the above code as a function and use reduce like this.
(defn merge-objects [obj1 obj2]
(into {} (for [[kx vx] obj1 [ky vy] obj2 :when (= kx ky)] {kx (+ vx vy)})))
(reduce merge-objects (map data ["Bob" "Jane"]))
I have not measured the performance, but note that the nested for compares every key of one map against every key of the other, so the cost grows with the product of the two map sizes. If your maps are small, you probably don't have to worry about it.
user> (let [data {"Bob" {"A" 3.5 "B" 4.5 "C" 2.0}
                  "Jane" {"A" 2.0 "B" 1.5 "D" 4.0}}
            common-keys (apply clojure.set/intersection
                               (map (comp set keys second) data))]
        (apply merge-with + (map #(select-keys % common-keys) (vals data))))
{"B" 6.0, "A" 5.5}
I generalized it a bit so that it is more agnostic of the incoming data.
