Reputation: 2455
I am looking for a join function that works like a join in SQL. For example,
here are two lists of maps:
(def a [{:user_id 1 :name "user 1"}
{:user_id 2 :name "user 2"}])
(def b [{:user_id 2 :email "e 2"}
{:user_id 1 :email "e 1"}])
I want to join a and b on user_id to get:
[{:user_id 1 :name "user 1" :email "e 1"}
{:user_id 2 :name "user 2" :email "e 2"}]
Is there a function in Clojure, or in some other library, that can do this?
Upvotes: 4
Views: 5031
Reputation: 1576
clojure.set/join will do the job.
(require '[clojure.set :as set])
(set/join a b) ; => #{{:email "e 1", :name "user 1", :user_id 1} {:email "e 2", :name "user 2", :user_id 2}}
Without a third argument, the function joins on all common keys:
(def a [{:id1 1 :id2 2 :name "n 1"} {:id1 2 :id2 3 :name "n 2"}])
(def b [{:id1 1 :id2 2 :url "u 1"} {:id1 2 :id2 4 :url "u 2"}])
(def c [{:id1 1 :id2 2 :url "u 1"} {:id1 2 :url "u 2"}]) ; :id2 is missing in 2nd record
(set/join a b) ; #{{:name "n 1", :url "u 1", :id1 1, :id2 2}}
(set/join a c) ; #{{:name "n 1", :url "u 1", :id1 1, :id2 2}}
; the common keys (:id1 and :id2) are taken from the first map of each collection, so the 2nd record of c, which lacks :id2, matches nothing
To join a and b only on id1:
(set/join a b {:id1 :id1}) ; #{{:name "n 2", :url "u 2", :id1 2, :id2 4} {:name "n 1", :url "u 1", :id1 1, :id2 2}}
We can even join by different keys from different collections:
(set/join a b {:id1 :id2}) ; #{{:name "n 2", :url "u 1", :id1 1, :id2 2}}
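For the data in the question, the join key can also be named explicitly. Since set/join returns a set, which has no order, the result can be sorted and turned back into a vector if order matters. A minimal sketch, assuming the question's data under the illustrative names users and emails (my own names, chosen because a and b were redefined above):

```clojure
(require '[clojure.set :as set])

;; Same data as in the question, under illustrative names.
(def users  [{:user_id 1 :name "user 1"}
             {:user_id 2 :name "user 2"}])
(def emails [{:user_id 2 :email "e 2"}
             {:user_id 1 :email "e 1"}])

;; Join explicitly on :user_id, then sort and convert to a vector,
;; because set/join returns an unordered set.
(def joined
  (vec (sort-by :user_id (set/join users emails {:user_id :user_id}))))

joined
;; => [{:user_id 1, :name "user 1", :email "e 1"}
;;     {:user_id 2, :name "user 2", :email "e 2"}]
```

Note that this is an inner join: a map with no partner in the other collection is dropped from the result.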
Upvotes: 10
Reputation: 20194
Another option, a bit simpler I think:
user=> (map #(apply merge %) (vals (group-by :user_id (concat a b))))
({:email "e 1", :name "user 1", :user_id 1} {:email "e 2", :name "user 2", :user_id 2})
group-by creates a mapping from each :user_id value to a vector of all the maps containing that value, vals keeps only those vectors, and finally the maps in each vector are merged into one.
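To make the steps easier to follow, here is the same pipeline split into its intermediate values. This is only a sketch: users and emails are illustrative names for the question's data, and the extra record for user 3 is made up to show what happens to a map with no partner.

```clojure
;; Same data as in the question, under illustrative names.
(def users  [{:user_id 1 :name "user 1"}
             {:user_id 2 :name "user 2"}])
(def emails [{:user_id 2 :email "e 2"}
             {:user_id 1 :email "e 1"}])

;; Step 1: group every map from both collections by :user_id.
(def grouped (group-by :user_id (concat users emails)))
;; => {1 [{:user_id 1, :name "user 1"} {:user_id 1, :email "e 1"}]
;;     2 [{:user_id 2, :name "user 2"} {:user_id 2, :email "e 2"}]}

;; Step 2: merge each group (a vector of maps) into a single map.
(def merged (map #(apply merge %) (vals grouped)))

;; A record with no partner survives unchanged, so this behaves like
;; a full outer join rather than an inner join.
(def with-extra
  (map #(apply merge %)
       (vals (group-by :user_id
                       (concat users emails [{:user_id 3 :email "e 3"}])))))
```

Here with-extra contains three maps, including the unmatched {:user_id 3, :email "e 3"}. If you need inner-join behaviour instead, the unmatched groups would have to be filtered out.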
Upvotes: 4
Reputation: 8854
I don't think there's any simple function that already does this, but I may be wrong.
If you know that each user_id exists in each sequence, then you can just sort on user_id, and then apply merge to corresponding maps:
(defn sort-by-user-id
  [maps]
  (sort-by :user_id maps))
(map merge (sort-by-user-id a) (sort-by-user-id b))
; => ({:email "e 1", :name "user 1", :user_id 1} {:email "e 2", :name "user 2", :user_id 2})
If you can't assume that the same user_ids exist in each sequence, I think you'll need to do something slightly more complicated to match them up. I'm assuming that if a name map has no corresponding email map, you want to leave the name map unchanged (and vice versa for an email map with no name map). If not, then one option would be to strip out those maps and use the method given above.
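The "strip out those maps" option could look roughly like this. It is only a sketch: inner-merge is a hypothetical helper name, the sample data is made up, and it assumes each user_id appears at most once per sequence.

```clojure
(require '[clojure.set :as set])

;; Illustrative data: user 3 has no email, user 4 has no name.
(def names  [{:user_id 1 :name "user 1"}
             {:user_id 3 :name "user 3"}
             {:user_id 2 :name "user 2"}])
(def emails [{:user_id 2 :email "e 2"}
             {:user_id 4 :email "e 4"}
             {:user_id 1 :email "e 1"}])

(defn inner-merge
  "Hypothetical helper: keeps only the maps whose :user_id occurs in
  both xs and ys, sorts both by :user_id, and merges them pairwise.
  Assumes each :user_id appears at most once per collection."
  [xs ys]
  (let [ids         (fn [coll] (set (map :user_id coll)))
        shared      (set/intersection (ids xs) (ids ys))
        keep-shared (fn [coll]
                      (sort-by :user_id
                               (filter #(contains? shared (:user_id %)) coll)))]
    (map merge (keep-shared xs) (keep-shared ys))))

(inner-merge names emails)
;; => ({:user_id 1, :name "user 1", :email "e 1"}
;;     {:user_id 2, :name "user 2", :email "e 2"})
```

After the filtering, both sequences contain exactly the same user_ids, so sorting lines the maps up and the pairwise merge is safe.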
Here is one way to merge corresponding name and email maps. We can use the user_ids as keys in a map of maps in order to match up corresponding maps. First create maps containing all of the maps with user_ids as keys, for example, like this:
(def az (zipmap (map :user_id a) a)) ; => {2 {:name "user 2", :user_id 2}, 1 {:name "user 1", :user_id 1}}
(def bz (zipmap (map :user_id b) b)) ; => {1 {:email "e 1", :user_id 1}, 2 {:email "e 2", :user_id 2}}
Then merge the individual maps like this, stripping out the keys at the end of the process:
(vals (merge-with merge az bz))
; => ({:email "e 2", :name "user 2", :user_id 2} {:email "e 1", :name "user 1", :user_id 1})
Putting it all together:
(defn map-of-maps
  [cm]
  (zipmap (map :user_id cm) cm))

(defn merge-maps
  [& cms]
  (vals
    (apply merge-with merge
           (map map-of-maps cms))))
Let's make sure that it works with missing user_ids:
(def a+ (conj a {:name "user 3", :user_id 3}))
(def b+ (conj b {:email "e 4", :user_id 4}))
(merge-maps a+ b+)
; => ({:email "e 4", :user_id 4} {:name "user 3", :user_id 3} {:email "e 2", :name "user 2", :user_id 2} {:email "e 1", :name "user 1", :user_id 1})
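If the join key is not always :user_id, the same idea can take the key as a parameter. A minimal sketch, where merge-maps-by is a hypothetical name of my own and the sample data is made up:

```clojure
(defn merge-maps-by
  "Hypothetical generalization of merge-maps: k is any key (or function)
  used to match maps across the collections. If a key value occurs more
  than once within one collection, zipmap keeps only the last such map."
  [k & colls]
  (vals (apply merge-with merge
               (map (fn [coll] (zipmap (map k coll) coll)) colls))))

;; Illustrative data keyed by :id instead of :user_id.
(def people [{:id 1 :name "user 1"} {:id 2 :name "user 2"}])
(def mails  [{:id 2 :email "e 2"} {:id 3 :email "e 3"}])

;; Returns three maps: ids 1 and 3 unchanged, id 2 merged.
;; The order of the result should not be relied upon.
(def merged-by-id (merge-maps-by :id people mails))
```

As with merge-maps, unmatched maps are kept as they are, so this is closer to a full outer join than to an inner join.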
I won't be surprised if there are simpler or more elegant methods. This is just one strategy that occurred to me.
Upvotes: 1