densmnko
densmnko

Reputation: 13

Clojure: how to merge several sorted vectors of vectors into common structure?

Need your help. Stuck on an intuitively simple task.

I have a few vectors of vectors. The first element of each of the sub-vectors is a numeric key. All parent vectors are sorted by these keys. For example:

[[1 a b] [3 c d] [4 f d] .... ] 

[[1 aa bb] [2 cc dd] [3 ww qq] [5 f]... ]

[[3 ccc ddd] [4 fff ddd] ...]

Need to clarify that some key values in nested vectors may be missing, but sorting order guaranteed.

I need to merge all of these vectors into some unified structure by numeric keys. I also need to now, that a key was missed in original vector or vectors.

Like this:

[ [[1 a b][1 aa bb][]] [[][2 cc dd]] [[3 c d][3 ww qq][3 ccc ddd]] [[4 f d][][4 fff dd]]...]

Upvotes: 1

Views: 533

Answers (3)

Chris Tennant
Chris Tennant

Reputation: 190

You can break the problem down into two parts:

1) get the unique keys in sorted order

2) for each unique key, iterate through the list of vectors, and output either the entry for the key, or an empty list if missing

To get the unique keys, just pull out all the keys into lists, concat them into one big list, and then put them into a sorted-set:

(into 
  (sorted-set) 
    (apply concat
       (for [vector vectors]
         (map first vector))))

if we start with a list of vectors of:

(def vectors
  [[[1 'a 'b] [3 'c 'd] [4 'f 'd]] 
   [[1 'aa 'bb] [2 'cc 'dd] [3 'ww 'qq] [5 'f]]
   [[3 'ccc 'ddd] [4 'fff 'ddd]]])

then we get a sorted set of:

=> #{1 2 3 4 5}

so far so good. now for each key in that sorted set we need to iterate through the vectors, and get the entry with that key, or an empty list if it's missing. You can do that using two 'for' forms and then 'some' to find the entry (if present)

(for [k #{1 2 3 4 5}]
  (for [vector vectors]
    (or (some #(when (= (first %) k) %) vector )
        [])))

this yields:

=> (([1 a b] [1 aa bb] []) ([] [2 cc dd] []) ([3 c d] [3 ww qq] [3 ccc ddd]) ([4 f d] [] [4 fff ddd]) ([] [5 f] []))

which I think is what you want. (if you need vectors and not lists, just use "(into [] ...)" in the appropriate places.)

Putting it all together, we get:

(defn unify-vectors [vectors] 
  (for [k (into (sorted-set) 
                (apply concat
                       (for [vector vectors]
                         (map first vector))))]
    (for [vector vectors]
      (or (some #(when (= (first %) k) %) vector)
          []))))

(unify-vectors
 [[[1 'a 'b] [3 'c 'd] [4 'f 'd]] 
  [[1 'aa 'bb] [2 'cc 'dd] [3 'ww 'qq] [5 'f]]
  [[3 'ccc 'ddd] [4 'fff 'ddd]]])

=> (([1 a b] [1 aa bb] []) ([] [2 cc dd] []) ([3 c d] [3 ww qq] [3 ccc ddd]) ([4 f d] [] [4 fff ddd]) ([] [5 f] []))

Upvotes: 3

n2o
n2o

Reputation: 6477

I do not have a complete solution for you, but as a hint: use group-by to sort your vectors for the first argument.

This will be more idiomatic and maybe just a few lines when it is ready.

So you could write something like

(group-by first [[1 :a :b] [3 :c :d] [4 :f :d]])

and do this for all vectors. Then you can sort / merge them with the keys provided by group-by.

Upvotes: 3

albusshin
albusshin

Reputation: 4010

This is a simple workaround, but doesn't meet the best practices of Clojure Programming. Just to give a simple idea here.

(def vectors
  [
   [[1 'a 'b] [3 'c 'd] [4 'f 'd]] 
   [[1 'aa 'bb] [2 'cc 'dd] [3 'ww 'qq] [5 'f]]
   [[3 'ccc 'ddd] [4 'fff 'ddd]]]
  )

(loop [i 1
      result []] 
  (def sub-result [])
  (doseq [v vectors]
    (doseq [sub-v v] 
      (if 
        (= i (first sub-v))
        (def sub-result (into sub-result [sub-v]))))

    (if-not 
      (some #{i}
            (map first v))
      (def sub-result (into sub-result [[]]))

      ))
  (if (< i 6)
    (recur (inc i) (into result [sub-result]))
    (print result)))

Upvotes: 1

Related Questions