Reputation: 2511

clojure merge-with remove keys that are not common

(def data {"Bob"    {"A" 3.5  "B" 4.5 "C" 2.0}
           "Jane"   {"A" 2.0  "B" 1.5 "D" 4.0}})

calling

(merge-with + (data "Bob") (data "Jane"))

produces

 {"A" 5.5, "B" 6.0, "C" 2.0, "D" 4.0}

I want the merged map to contain only the keys that are common to both maps. The result I'm looking for is

   {"A" 5.5, "B" 6.0}

What's a good way to do this in Clojure?

Upvotes: 0

Answers (6)

Adam

Reputation: 821

(defn reduce-merge [& maps]
  (when (some identity maps)
    (reduce #(select-keys (or %2 %1) (keys %1)) maps)))

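This worked well for me; the (or %2 %1) is there to swallow nils in the list of maps. It does not handle deep merging, and it does not take a function for resolving collisions (which will always happen for common keys): the value from the last non-nil map wins.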
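To make the behaviour concrete, here is a small sketch that runs reduce-merge on the data from the question. Note that the values are not summed: only the common keys survive, and the values come from the later map.

```clojure
;; Sample data copied from the question.
(def data {"Bob"  {"A" 3.5 "B" 4.5 "C" 2.0}
           "Jane" {"A" 2.0 "B" 1.5 "D" 4.0}})

(defn reduce-merge [& maps]
  (when (some identity maps)
    (reduce #(select-keys (or %2 %1) (keys %1)) maps)))

;; Keys common to both maps, values taken from Jane's map.
(reduce-merge (data "Bob") (data "Jane"))
;; => {"A" 2.0, "B" 1.5}

;; A nil in the list is skipped instead of wiping out the result.
(reduce-merge (data "Bob") nil)
;; => {"A" 3.5, "B" 4.5, "C" 2.0}
```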

Upvotes: 0

Michał Marczyk

Reputation: 84341

Performance-oriented solution using transients, reduce-kv and a size check to iterate over the smaller map:

(defn merge-common-with [f m1 m2]
  (let [[a b] (if (< (count m1) (count m2))
                [m1 m2]
                [m2 m1])]
    (persistent!
     (reduce-kv (fn [out k v]
                  (if (contains? b k)
                    (assoc! out k (f (get a k) (get b k)))
                    out))
                (transient {})
                a))))

At the REPL, using sample data from the question text:

(merge-common-with + (data "Bob") (data "Jane"))
;= {"A" 5.5, "B" 6.0}
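One thing to keep in mind with the size check: when m1 is not strictly smaller than m2, the maps are swapped, so f receives the value from m2 first. That makes no difference for a commutative function such as +, but it would for something like - or vector. A possible variant that keeps the argument order is sketched below; the name merge-common-with-ordered is made up for this illustration and is not part of the answer above.

```clojure
;; Hypothetical variant: still iterates over the smaller map,
;; but always calls f as (f value-from-m1 value-from-m2).
(defn merge-common-with-ordered [f m1 m2]
  (let [swap? (>= (count m1) (count m2))
        [a b] (if swap? [m2 m1] [m1 m2])]
    (persistent!
     (reduce-kv (fn [out k v]
                  ;; find returns the map entry, or nil if k is absent from b
                  (if-let [e (find b k)]
                    (assoc! out k (if swap?
                                    (f (val e) v)
                                    (f v (val e))))
                    out))
                (transient {})
                a))))

(merge-common-with-ordered - {"A" 5 "B" 7} {"A" 1 "B" 2 "C" 3})
;; => {"A" 4, "B" 5}

(merge-common-with-ordered - {"A" 1 "B" 2 "C" 3} {"A" 5 "B" 7})
;; => {"A" -4, "B" -5}
```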

Note that while I expect the above to be the fastest approach in many circumstances, I'd definitely benchmark using data typical for your actual use case. Here's a Criterium-based benchmark using data from the question text (merge-common-with wins here):

(require '[criterium.core :as c])

(def a (data "Bob"))
(def b (data "Jane"))

;; Hendekagon's elegant approach amended to select-keys on both sides
(defn merge-common-with* [f a b]
  (merge-with f
              (select-keys a (keys b))
              (select-keys b (keys a))))

;; benchmarks for three approaches follow, fastest to slowest

(c/bench (merge-common-with + a b))
Evaluation count : 74876640 in 60 samples of 1247944 calls.
             Execution time mean : 783.233604 ns
    Execution time std-deviation : 7.660391 ns
   Execution time lower quantile : 771.514052 ns ( 2.5%)
   Execution time upper quantile : 802.622953 ns (97.5%)
                   Overhead used : 1.266543 ns

Found 3 outliers in 60 samples (5.0000 %)
    low-severe   3 (5.0000 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers

(c/bench (merge-matching + a b)) ; amalloy's approach
Evaluation count : 57320640 in 60 samples of 955344 calls.
             Execution time mean : 1.047921 µs
    Execution time std-deviation : 16.221173 ns
   Execution time lower quantile : 1.025001 µs ( 2.5%)
   Execution time upper quantile : 1.076081 µs (97.5%)
                   Overhead used : 1.266543 ns

(c/bench (merge-common-with* + a b))
WARNING: Final GC required 3.4556868188006065 % of runtime
Evaluation count : 33121200 in 60 samples of 552020 calls.
             Execution time mean : 1.862483 µs
    Execution time std-deviation : 26.008801 ns
   Execution time lower quantile : 1.821841 µs ( 2.5%)
   Execution time upper quantile : 1.914336 µs (97.5%)
                   Overhead used : 1.266543 ns

Found 1 outliers in 60 samples (1.6667 %)
    low-severe   1 (1.6667 %)
 Variance from outliers : 1.6389 % Variance is slightly inflated by outliers

Upvotes: 4

amalloy

Reputation: 91917

Here is a fairly straightforward single-pass approach, which should outperform the multi-pass approaches so far suggested, without being particularly difficult to read:

(defn merge-matching [f a b]
  (into {}
        (for [[k v] a
              :let [e (find b k)]
              :when e]
          [k (f v (val e))])))
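For reference, a short usage sketch on the data from the question. The second call illustrates one reason to use find instead of get here: a key whose value is nil still counts as present in b.

```clojure
;; Sample data copied from the question.
(def data {"Bob"  {"A" 3.5 "B" 4.5 "C" 2.0}
           "Jane" {"A" 2.0 "B" 1.5 "D" 4.0}})

(defn merge-matching [f a b]
  (into {}
        (for [[k v] a
              :let [e (find b k)]
              :when e]
          [k (f v (val e))])))

(merge-matching + (data "Bob") (data "Jane"))
;; => {"A" 5.5, "B" 6.0}

;; :x maps to nil in b, but it is still a common key.
(merge-matching vector {:x 1 :y 2} {:x nil})
;; => {:x [1 nil]}
```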

Upvotes: 5

Udayakumar Rayala

Reputation: 2284

If you always want to merge only two objects, you can also do something like this.

(into {}
  (for [[kx vx] (data "Bob")
        [ky vy] (data "Jane")
        :when (= kx ky)]
    {kx (+ vx vy)}))

If you want to merge multiple objects, then you can define the code above as a function and use reduce like this.

(defn merge-objects [obj1 obj2] 
  (into {} (for [[kx vx] obj1 [ky vy] obj2  :when (= kx ky)] {kx (+ vx vy)})))

(reduce merge-objects (map data ["Bob" "Jane"]))

Note that this compares every key of one map against every key of the other, so the cost grows with the product of the two map sizes. If your maps are small, you might not have to worry about it.

Upvotes: 0

noisesmith

Reputation: 20194

user> (let [data {"Bob"    {"A" 3.5  "B" 4.5 "C" 2.0}
                  "Jane"   {"A" 2.0  "B" 1.5 "D" 4.0}}
            common-keys (apply clojure.set/intersection
                               (map (comp set keys second) data))]
        (apply merge-with + (map #(select-keys % common-keys) (vals data))))
{"B" 6.0, "A" 5.5}

I generalized it a bit so that it is more agnostic of the incoming data.
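If it helps, the same idea can be wrapped in a function that takes the combining function and a collection of maps. This is only a sketch: the name merge-common is invented here, it assumes at least one map is passed, and in a fresh namespace clojure.set has to be required first.

```clojure
(require '[clojure.set :as set])

;; Hypothetical helper: keeps only the keys present in every map,
;; then combines the values with f.
;; Assumes maps is non-empty (set/intersection needs at least one set).
(defn merge-common [f maps]
  (let [common-keys (apply set/intersection (map (comp set keys) maps))]
    (apply merge-with f (map #(select-keys % common-keys) maps))))

(merge-common + [{"A" 3.5 "B" 4.5 "C" 2.0}
                 {"A" 2.0 "B" 1.5 "D" 4.0}
                 {"A" 1.0 "D" 1.0}])
;; => {"A" 6.5}
```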

Upvotes: 0

Hendekagon

Reputation: 4643

(merge-with merge-fn A (select-keys B (keys A)))
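As the benchmark in another answer points out, select-keys has to be applied on both sides to get only the common keys; with the one-sided form, keys that exist only in A are kept. A quick sketch with the question's data, where A and B stand for the two inner maps:

```clojure
(def A {"A" 3.5 "B" 4.5 "C" 2.0})
(def B {"A" 2.0 "B" 1.5 "D" 4.0})

;; One-sided: "D" is dropped, but "C" (only in A) survives.
(merge-with + A (select-keys B (keys A)))
;; => {"A" 5.5, "B" 6.0, "C" 2.0}

;; Two-sided: only the common keys remain.
(merge-with +
            (select-keys A (keys B))
            (select-keys B (keys A)))
;; => {"A" 5.5, "B" 6.0}
```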

Upvotes: 1
