David Rz Ayala
David Rz Ayala

Reputation: 2235

How can I get the nested keys of a map in clojure?

if my structure is

{ :a :A
  :b :B
  :c {
       :d :D
     }
  :e {
       :f {
            :g :G
            :h :H
          }
     }
}

I would like to get a function called keys-in that returns something like:

[[:a] [:b] [:c :d] [:e :f :g] [:e :f :h]]

so then I can do something like:

(not-any? nil? (map #(get-in my-other-map %1) (keys-in my-map)))

So I can be sure that my-other-map has the same keys that my-map

Upvotes: 10

Views: 7029

Answers (12)

Alex Robbins
Alex Robbins

Reputation: 101

Here's a tree-seq-based version that's lazy and doesn't blow the stack.

(defn map-paths [m]
  (letfn [(branch? [[path v]] ((some-fn map? nil?) v))
          (children [[path m]] (for [[k v] m]
                                 [(conj path k) v]))]
    (remove branch?
            (tree-seq branch? children [[] m]))))

It produces a lazy sequence of path-value pairs (not just paths). Maps and nil are the only things considered a branch.

Note that branches can be empty: (map-paths {:foo :bar}) is the same as (map-paths {:foo :bar, :baz {}}). This preserves the property that (map-paths {:k foo}) is the same as (map-paths foo) with :k prepended to all of the paths.

(Kudos to @a-webb for demonstrating the applicability of zippers, which inspired this answer.)

Upvotes: 0

KingCode
KingCode

Reputation: 888

Here is a generic solution for known collection types, including maps (look for "Key Paths" on the Readme page for usage examples).

It handles mixed types as well (sequential types, maps and sets), and the API (protocols) can be extended to other types.

Upvotes: 0

miner49r
miner49r

Reputation: 1117

If you don't need a lazy result and just want to be fast, try using reduce-kv.

(defn keypaths
  ([m] (keypaths [] m ()))
  ([prev m result]
   (reduce-kv (fn [res k v] (if (map? v)
                              (keypaths (conj prev k) v res)
                              (conj res (conj prev k))))
              result
              m)))

If you also want to support vector indices (as with get-in or update-in), test with associative? instead of map?. If you want intermediate paths, you can conj those on too. Here's a variant:

(defn kvpaths-all2
  ([m] (kvpaths-all2 [] m ()))
  ([prev m result]
   (reduce-kv (fn [res k v] (if (associative? v)
                              (let [kp (conj prev k)]
                                (kvpaths-all2 kp v (conj res kp)))
                              (conj res (conj prev k))))
              result
              m)))

Upvotes: 5

user2762156
user2762156

Reputation: 31

Working on something similar for a personal project and this is my naive implementation:

(defn keys-in
  [m parent-keys]
  (mapcat (fn [[k v]]
        (if (map? v)
          (keys-in v (conj parent-keys k))
          (vector (conj parent-keys k v))))
      m))

Use it from the repl:

(keys-in <your-map> [])

Fancy way:

(map (comp vec drop-last) (keys-in <your-map> []))

Upvotes: 0

Mars
Mars

Reputation: 8854

Here are solutions (without intermediate paths) using Specter. They're by Nathan Marz, Specter's author, from a conversation on the Specter Slack channel (with his permission). I claim no credit for these definitions.

Simple version:

(defn keys-in [m]
  (let [nav (recursive-path [] p
              (if-path map?
                [ALL (collect-one FIRST) LAST p]
                STAY))]
    (map butlast (select nav m))))

More efficient version:

(defn keys-in [m]
  (let [nav (recursive-path [] p
              (if-path map?
                [ALL
                 (if-path [LAST map?]
                  [(collect-one FIRST) LAST p]
                  FIRST)]))]
    (select nav m)))

My informal explanation of what's happening in these definitions:

In the simple version, since the top-level argument is a map, if-path map? passes it to the first collection of navigators in brackets. These begin with ALL, which says here to do the rest for each element in the map. Then for each MapEntry in the map, (collect-one FIRST) says to add its first element (key) to the result of passing its last element (val) to if-path again. p was bound by recursive-path to be a reference to that same recursive-path expression. By this process we eventually get to a non-map. Return it and stop processing on that branch; that's what STAY means. However, this last thing returned is not one of the keys; it's the terminal val. So we end up with the leaf vals in each sequence. To strip them out, map butlast over the entire result.

The second version avoids this last step by only recursing into the val in the MapEntry if that val is itself a map. That's what the inner if-path does: [LAST map?] gets the last element, i.e. the val of the current MapEntry generated by ALL, and passes it to map?.


I used Criterium to test all of the key path functions on this page that don't return intermediate paths, plus one by noisesmith that's part of an answer to another question. For a 3-level, 3 keys per level map and for a 6-level, 6 keys per level map, miner49r's version and the second, faster Specter version have similar speeds, and are much faster than any of the other versions.

Timings on a 3-level, 3 keys per level (27 paths) map, in order:

  • miner49r's: 29.235649 µs
  • N. Marz's second Specter: 30.590085 µs
  • N. Marz's first Specter: 62.840230 µs
  • amalloy's: 75.740468 µs
  • noisesmith's (from the other question): 87.693425 µs
  • AWebb's: 162.281035 µs
  • AlexMiller's (without vec): 243.756275 µs

Timings on a 6-level, 6 keys per level (6^6 = 46656 paths) map, in order:

  • N. Marz's second Specter: 34.435956 ms
  • miner49r's: 37.897345 ms
  • N. Marz's first Specter: 119.600975 ms
  • noisesmith's: 180.448860 ms
  • amalloy's: 191.718783 ms
  • AWebb's: 193.172784 ms
  • AlexMiller's (without vec): 839.266448 ms

All calls were wrapped in doall so that lazy results would be realized. Since I was doalling them, I took out vec wrapper in Alex Miller's definition. Full details about timings can be found here. The test code is here.

(The simple Specter version is slower than the faster version because of the use of map butlast to strip out the leaf values. If this is step is removed, the simple Specter definition's times are similar to those of the second definition.)

Upvotes: 2

Laurent
Laurent

Reputation: 1698

Got a similar question, wasn't satisfied by current solutions:

"Naive" recursive approach

(require '[clojure.set :as set])

(defn all-paths
  ([m current]
   ;; base case: map empty or not a map
   (if (or (not (map? m)) (empty? m))
     #{current}
   ;; else: recursive call for every (key, value) in the map
     (apply set/union #{current}
            (map (fn [[k v]]
                   (all-paths v (conj current k)))
                 m))))
  ([m]
   (-> m (all-paths []) (disj []))))


(all-paths {:a 1
            :b 2
            :c {:ca 3
                :cb {:cba 4
                     :cbb 5}}
            :d {:da 6
                :db 7}})
=> #{[:a] [:b] [:c] [:d] [:c :ca] [:c :cb] [:d :da] [:d :db] [:c :cb :cba] [:c :cb :cbb]}

Upvotes: 2

Aaron Cummings
Aaron Cummings

Reputation: 11

Here is an implementation which returns all keys (not just the terminal keys) based on lazy-seq:

(defn keys-in
  ([m] (if (map? m) (keys-in (seq m) [])))
  ([es c]
   (lazy-seq
    (when-let [e (first es)]
      (let [c* (conj c (key e))]
        (cons c* (concat (if (map? (val e)) (keys-in (seq (val e)) c*))
                         (keys-in (rest es) c))))))))

Upvotes: 1

David Rz Ayala
David Rz Ayala

Reputation: 2235

This answer of mine is just to illustrate how NOT to do it since it is still procedural.

(defn keys-in [data] (genkeys [] data))

(defn genkeys [parent data]
  (let [mylist (transient [])]
    (doseq [k (keys data)]
      (do
        (if ( = (class (k data)) clojure.lang.PersistentHashMap )
          (#(reduce conj! %1 %2) mylist (genkeys (conj parent  k ) (k data) ))
          (conj! mylist  (conj parent  k ) )
          )))
    (persistent! mylist)))

Upvotes: 1

A. Webb
A. Webb

Reputation: 26466

Obligatory zippers version

(require '[clojure.zip :as z])

(defn keys-in [m] 
  (letfn [(branch? [[path m]] (map? m)) 
          (children [[path m]] (for [[k v] m] [(conj path k) v]))] 
    (if (empty? m) 
      []
      (loop [t (z/zipper branch? children nil [[] m]), paths []] 
        (cond (z/end? t) paths 
              (z/branch? t) (recur (z/next t), paths) 
              :leaf (recur (z/next t), (conj paths (first (z/node t)))))))))

Upvotes: 4

Alex Miller
Alex Miller

Reputation: 70239

(defn keys-in [m]
  (if (map? m)
    (vec 
     (mapcat (fn [[k v]]
               (let [sub (keys-in v)
                     nested (map #(into [k] %) (filter (comp not empty?) sub))]
                 (if (seq nested)
                   nested
                   [[k]])))
             m))
    []))

;; tests
user=> (keys-in nil)
[]
user=> (keys-in {})
[]
user=> (keys-in {:a 1 :b 2}))
[[:a] [:b]]
user=> (keys-in {:a {:b {:c 1}}})
[[:a :b :c]]
user=> (keys-in {:a {:b {:c 1}} :d {:e {:f 2}}})
[[:a :b :c] [:d :e :f]]

Upvotes: 16

Arthur Ulfeldt
Arthur Ulfeldt

Reputation: 91617

You can build this with clojure.zip or tree-seq fairly easily though I strongly prefer the prismatic.schema library for verifying the structure of nested maps

user> (def my-data-format                                 
  {:a Keyword                                             
   :b Keyword                                             
   :c {:d Keyword}                                        
   :e {:f {:g Keyword                                     
           :h Keyword}}})                                 
#'user/my-data-format                                     
user> (def some-data                                      
         {:a :A                                            
          :b :B                                            
          :c {:d :D}                                       
          :e {:f {:g :G                                    
                  :h :G}}})                                
#'user/some-data                                          
user> (schema/validate my-data-format some-data)          
{:a :A, :c {:d :D}, :b :B, :e {:f {:g :G, :h :G}}}
user> (def some-wrong-data
        {:a :A
         :b :B
         :c {:wrong :D}
         :e {:f {:g :G
                 :h :G}}})
 #'user/some-wrong-data             

 user> (schema/validate my-data-format some-wrong-data)  

ExceptionInfo Value does not match schema: 
{:c {:d missing-required-key, 
     :wrong disallowed-key}}  
schema.core/validate (core.clj:132)

Upvotes: 2

amalloy
amalloy

Reputation: 92147

(defn keys-in [m]
  (if (or (not (map? m))
          (empty? m))
    '(())
    (for [[k v] m
          subkey (keys-in v)]
      (cons k subkey))))

Upvotes: 9

Related Questions