djhaskin987

Reputation: 10057

Split a sequence by delimiter in Clojure?

Say I have a sequence in Clojure like

'(1 2 3 6 7 8)

and I want to split it up so that the list splits whenever an element divisible by 3 is encountered, so that the result looks like

'((1 2) (3) (6 7 8))

EDIT: What I actually need is

[[1 2] [3] [6 7 8]]

but I'll take the sequence version too :)

What is the best way to do this in Clojure?

partition-by is no help:

(partition-by #(= (rem % 3) 0) '(1 2 3 6 7 8))
; => ((1 2) (3 6) (7 8))
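
The reason is that partition-by starts a new group only when the predicate's return value changes, so the adjacent matches 3 and 6 end up in the same group. Mapping the predicate over the input makes this visible (`xs` and `div-by-3?` are names made up for this illustration):

```clojure
(def xs '(1 2 3 6 7 8))

;; Illustrative helper name, not part of the original question.
(def div-by-3? #(zero? (rem % 3)))

;; partition-by groups runs of equal predicate results:
(map div-by-3? xs)
;; => (false false true true false false)

;; 3 and 6 both map to true, so they share one group.
(partition-by div-by-3? xs)
;; => ((1 2) (3 6) (7 8))
```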

split-with is close:

(split-with #(not (= (rem % 3) 0)) '(1 2 3 6 7 8))
; => [(1 2) (3 6 7 8)]

Upvotes: 4

Views: 1680

Answers (3)

Alan Thompson

Reputation: 29958

This is an interesting problem. I recently added a function split-using to the Tupelo library, which seems like a good fit here. I left the spyx debug statements in the code below so you can see how things progress:

(ns tst.clj.core
  (:use clojure.test tupelo.test)
  (:require
    [tupelo.core :as t]  ))
(t/refer-tupelo)

(defn start-segment? [vals]
  (zero? (rem (first vals) 3)))

(defn partition-using [pred vals-in]
  (loop [vals   vals-in
         result []]
    (if (empty? vals)
      result
      (t/spy-let [
          out-first               (take 1 vals)
          [out-rest unprocessed]  (split-using pred (spyx (next vals)))
          out-vals                (glue out-first out-rest)
          new-result              (append result out-vals)]
        (recur unprocessed new-result)))))

Which gives us output like:

out-first => (1)
(next vals) => (2 3 6 7 8)
[out-rest unprocessed] => [[2] (3 6 7 8)]
out-vals => [1 2]
new-result => [[1 2]]
out-first => (3)
(next vals) => (6 7 8)
[out-rest unprocessed] => [[] [6 7 8]]
out-vals => [3]
new-result => [[1 2] [3]]
out-first => (6)
(next vals) => (7 8)
[out-rest unprocessed] => [[7 8] ()]
out-vals => [6 7 8]
new-result => [[1 2] [3] [6 7 8]]

(partition-using start-segment? [1 2 3 6 7 8]) => [[1 2] [3] [6 7 8]]

or for a larger input vector:

(partition-using start-segment? [1 2 3 6 7 8 9 12 13 15 16 17 18 18 18 3 4 5])
   => [[1 2] [3] [6 7 8] [9] [12 13] [15 16 17] [18] [18] [18] [3 4 5]]

You could also create a solution using nested loop/recur, but that is already coded up in the split-using function:

(defn split-using
  "Splits a collection based on a predicate with a collection argument.
  Finds the first index N such that (pred (drop N coll)) is true. Returns a length-2 vector
  of [ (take N coll) (drop N coll) ]. If pred is never satisfied, [ coll [] ] is returned."
  [pred coll]
  (loop [left  []
         right (vec coll)]
    (if (or (empty? right) ; don't call pred if no more data
            (pred right))
      [left right]
      (recur  (append left (first right))
              (rest right)))))

Actually, the above function seems like it would be useful in the future. partition-using has now been added to the Tupelo library.
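
If you would rather not pull in Tupelo, the same idea can probably be sketched with clojure.core alone. This is only a rough stand-in, not Tupelo's actual implementation: `split-using*` and `partition-using*` are hypothetical names, and `conj`/`into` replace `append`/`glue` under the assumption that they behave equivalently for vectors here.

```clojure
;; Predicate receives the remaining collection, as in the answer above.
(defn start-segment? [xs]
  (zero? (rem (first xs) 3)))

;; Hypothetical clojure.core-only stand-in for split-using.
(defn split-using* [pred coll]
  (loop [left  []
         right (vec coll)]
    (if (or (empty? right)   ; don't call pred if no more data
            (pred right))
      [left right]
      (recur (conj left (first right))
             (rest right)))))

;; Hypothetical clojure.core-only stand-in for partition-using.
(defn partition-using* [pred coll]
  (loop [xs     (seq coll)
         result []]
    (if (empty? xs)
      result
      (let [;; the first item always opens a group; the rest runs until pred fires
            [out-rest unprocessed] (split-using* pred (rest xs))
            out-vals               (into [(first xs)] out-rest)]
        (recur unprocessed (conj result out-vals))))))

(partition-using* start-segment? [1 2 3 6 7 8])
;; => [[1 2] [3] [6 7 8]]
```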

Upvotes: 2

leetwinski

Reputation: 17859

And one more old-school, reduce-based solution:

user> (defn split-all [pred items]
        (when (seq items)
          (apply conj (reduce (fn [[acc curr] x]
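                                ;; start a new group at x, unless curr is still empty
                                ;; (avoids a leading [] when the first item matches)
                                (if (and (pred x) (seq curr))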
                                  [(conj acc curr) [x]]
                                  [acc (conj curr x)]))
                              [[] []] items))))
#'user/split-all

user> (split-all #(zero? (rem % 3)) '(1 2 3 6 7 8 10 11 12))
;;=> [[1 2] [3] [6 7 8 10 11] [12]]

Upvotes: 1

Derek Troy-West

Reputation: 2479

Something like this?

(defn partition-with
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [run (cons (first s) (take-while (complement f) (next s)))]
        (cons run (partition-with f (seq (drop (count run) s))))))))

(partition-with #(= (rem % 3) 0) [1 2 3 6 7 8 9 12 13 15 16 17 18])
=> ((1 2) (3) (6 7 8) (9) (12 13) (15 16 17) (18))
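
Since the question's edit asks for a vector of vectors, the lazy result can be converted with `mapv vec`. A minimal sketch, repeating the definition above so it runs on its own; note that `mapv` is eager, so this realizes the whole sequence and gives up the laziness:

```clojure
(defn partition-with
  [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (let [run (cons (first s) (take-while (complement f) (next s)))]
        (cons run (partition-with f (seq (drop (count run) s))))))))

;; Convert each inner seq to a vector, collecting into an outer vector.
(mapv vec (partition-with #(zero? (rem % 3)) '(1 2 3 6 7 8)))
;; => [[1 2] [3] [6 7 8]]
```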

Upvotes: 4
