merge-with merge: can this be simplified?

Question

I have a list of maps in which the value of one key (:id) may repeat across maps. I'd like to dedup/merge the list so that there is one map per :id. So for example:

(def data [{:id 1 :a 2 :b 3 :c 4} {:id 1 :c 5 :d 6} {:id 2 :a 100 :b 101 :c 102} {:id 2 :a 103 :d 104} {:id 2 :a 200 :f 201}])

And I'd like to end up with:

[{:id 1 :a 2 :b 3 :c 5 :d 6} {:id 2 :a 200 :b 101 :c 102 :d 104 :f 201}]

(I've phrased the question so that merge/merge-with works, but the truth is I don't really care what happens with overlapping values; the first in, or the last in, can win).

What I've got is:

(vals (apply merge-with merge (into [] (map #(hash-map (:id %) %) data))))

Which does work, but I'm wondering if there's a better, more concise, or more elegant way of doing this. I also wonder about performance, because I think into is doing a full copy of the sequence and forcing the entire thing into memory (the original data was a lazy sequence).

Francis Avila · Accepted Answer

If you know for sure that maps with the same :id will always be contiguous, you can use partition-by to create subsequences of the data by id and merge those subsequences:

(map (partial apply merge) (partition-by :id data))

This will be lazy and last-in will win.
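
As a rough sketch of how this behaves on the data from the question, the snippet below wraps the partition-by expression in a function and adds a fallback for input where equal :id values are not adjacent. The names merge-contiguous and merge-by-id are made up for illustration, and the reduce-based fallback is an assumption about what a non-contiguous version could look like. It is eager and needs Clojure 1.7+ for update.

```clojure
(def data
  [{:id 1 :a 2 :b 3 :c 4}
   {:id 1 :c 5 :d 6}
   {:id 2 :a 100 :b 101 :c 102}
   {:id 2 :a 103 :d 104}
   {:id 2 :a 200 :f 201}])

;; Hypothetical helper: the partition-by approach from the answer.
;; Lazy; within each run of equal :id values, the last map wins.
(defn merge-contiguous [maps]
  (map (partial apply merge) (partition-by :id maps)))

;; Hypothetical fallback for when equal ids are NOT adjacent.
;; Eager: it builds an intermediate map keyed by :id, so the whole
;; input is realized, and the order of the result is not guaranteed.
(defn merge-by-id [maps]
  (vals (reduce (fn [acc m] (update acc (:id m) merge m))
                {}
                maps)))

(merge-contiguous data)
;; => ({:id 1, :a 2, :b 3, :c 5, :d 6}
;;     {:id 2, :a 200, :b 101, :c 102, :d 104, :f 201})
```

If the ids are interleaved, for example [{:id 1 :a 1} {:id 2 :a 2} {:id 1 :b 3}], merge-contiguous returns three maps rather than two, because partition-by only groups adjacent items. That is the case where something like merge-by-id would be needed, at the cost of laziness.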
