liwp
liwp

Reputation: 6926

Idiomatic way to recurse through collections in Clojure

I'm trying to understand what is the idiomatic way in Clojure to recurse through a tree or a list represented by a Clojure list (or another collection type).

I could write the following to count the elements in a flat collection (ignore the fact that it's not tail-recursive):

(defn length
  ([xs]
     (if (nil? (seq xs))
       0
       (+ 1 (length (rest xs))))))

Now in Scheme or CL all the examples only ever do this over lists, so the idiomatic base case test in those languages would be (nil? xs). In Clojure we'd like this function to work on all collection types, so is the idiomatic test (nil? (seq xs)), or maybe (empty? xs), or something completely different?

The other case I'd like to consider is tree traversal, i.e. traversing through a list or vector that represents a tree, e.g. [1 2 [3 4].

For example, counting the nodes in a tree:

(defn node-count [tree]
  (cond (not (coll? tree)) 1
        (nil? (seq tree)) 0
        :else (+ (node-count (first tree)) (node-count (rest tree)))))

Here we use (not (coll? tree)) to check for atoms, whereas in Scheme/CL we'd use atom?. We also use (nil? (seq tree)) to check for an empty collection. And finally we use first and rest to destructure the current tree to the left branch and the rest of the tree.

So to summarise, are the following forms idiomatic in Clojure:

Upvotes: 12

Views: 1690

Answers (2)

mikera
mikera

Reputation: 106351

I personally like the following approach to recurse through a collection:

(defn length
  "Calculate the length of a collection or sequence"
  ([coll]
     (if-let [[x & xs] (seq coll)]
       (+ 1 (length xs))
       0)))

Features:

  • (seq coll) is idiomatic for testing whether a collection is empty (as per Michal's great answer)
  • if-let with (seq coll) automatically handles both the nil and empty collection case
  • You can use destructuring to name the first and next elements as you like for use in your function body

Note that in general it is better to write recursive functions using recur if possible, so that you get the benefits of tail recursion and don't risk blowing up the stack. So with this in mind, I'd actually probably write this specific function as follows:

(defn length
  "Calculate the length of a collection or sequence"
  ([coll]
    (length coll 0))
  ([coll accumulator]
    (if-let [[x & xs] (seq coll)]
      (recur xs (inc accumulator))
      accumulator)))

(length (range 1000000))
=> 1000000

Upvotes: 9

Michał Marczyk
Michał Marczyk

Reputation: 84331

The idiomatic test for a non-empty seqable is (seq coll):

(if (seq coll)
  ...
  )

The nil? is unnecessary, since a non-nil return value from seq is guaranteed to be a seq and thus neither nil nor false and therefore truthy.

If you want to deal with the nil case first, you can change the if to if-not or seq to empty?; the latter is implemented as a composition of seq with not (which is why it is not idiomatic to write (not (empty? xs)), cf. the docstring of empty?).

As for first / rest -- it's useful to remember about the strict variant of rest, next, the use of which is more idiomatic than wrapping rest in a seq.

Finally, coll? checks if its argument is a Clojure persistent collection (an instance of clojure.lang.IPersistentCollection). Whether this is an appropriate check for "non-atoms" depends on whether the code needs to handle Java data structures as non-atoms (via interop): e.g. (coll? (java.util.HashSet.)) is false, as is (coll? (into-array [])), but you can call seq on both. There is a function called seqable? in core.incubator in the new modular contrib which promises to determine whether (seq x) would succeed for a given x.

Upvotes: 11

Related Questions