demi
demi

Reputation: 5504

Clojure: find repetition

Let's say we have a list of integers: 1, 2, 5, 13, 6, 5, 7 and I want to find the first number that repeats and return a vector of the two indices. In my sample, it's 5 at [2, 5]. What I did so far is loop, but can I do it more elegant, short way?

(defn get-cycle
  [xs]
  (loop [[x & xs_rest] xs, indices {}, i 0]
    (if (nil? x)
      [0 i]  ; Sequence is over before we found a duplicate.
      (if-let [x_index (indices x)]
        [x_index i]
        (recur xs_rest (assoc indices x i) (inc i))))))

No need to return number itself, because I can get it by index and, second, it may be not always there.

Upvotes: 2

Views: 1078

Answers (5)

Leon Grapenthin
Leon Grapenthin

Reputation: 9266

I know that you have only asked for the first. Here is a fully lazy implementation with little per-step allocation overhead

(defn dups
[coll]
(letfn [(loop-fn [idx [elem & rest] cached]
      (if elem
          (if-let [last-idx (cached elem)]
        (cons [last-idx idx]
              (lazy-seq (loop-fn (inc idx) rest (dissoc cached elem))))
        (lazy-seq (loop-fn (inc idx) rest (assoc cached elem idx))))))]
  (loop-fn 0 coll {})))

(first (dups v))
=> [2 5]

Edit: Here are some criterium benchmarks:

The answer that got accepted: 7.819269 µs

This answer (first (dups [12 5 13 6 5 7])): 6.176290 µs

Beschastnys: 5.841101 µs

first-duplicate: 5.025445 µs

Upvotes: 1

aldazosa
aldazosa

Reputation: 345

Here is a version using reduced to stop consuming the sequence when you find the first duplicate:

(defn first-duplicate [coll]
  (reduce (fn [acc [idx x]]
            (if-let [v (get acc x)]
              (reduced (conj v idx))
              (assoc acc x [idx])))
          {} (map-indexed #(vector % %2) coll)))

Upvotes: 4

jbm
jbm

Reputation: 2600

The intent of your code seems different from your description in the comments so I'm not totally confident I understand. That said, loop/recur is definitely a valid way to approach the problem.

Here's what I came up with:

(defn get-cycle [xs]
  (loop [xs xs index 0]
    (when-let [[x & more] (seq xs)]
      (when-let [[y] (seq more)]
        (if (= x y)
          {x [index (inc index)]}
          (recur more (inc index)))))))

This will return a map of the repeated item to a vector of the two indices the item was found at.

(get-cycle [1 1 2 1 2 4 2 1 4 5 6 7])
;=> {1 [0 1]}

(get-cycle [1 2 1 2 4 2 1 4 5 6 7 7])
;=> {7 [10 11]}

(get-cycle [1 2 1 2 4 2 1 4 5 6 7 8])
;=> nil

Here's an alternative solution using sequence functions. I like this way better but whether it's shorter or more elegant is probably subjective.

(defn pairwise [coll]
  (map vector coll (rest coll)))

(defn find-first [pred xs]
  (first (filter pred xs)))

(defn get-cycle [xs]
  (find-first #(apply = (val (first %)))
              (map-indexed hash-map (pairwise xs))))

Edited based on clarification from @demi

Ah, got it. Is this what you have in mind?

(defn get-cycle [xs]
  (loop [xs (map-indexed vector xs)]
    (when-let [[[i n] & more] (seq xs)]
      (if-let [[j _] (find-first #(= n (second %)) more)]
        {n [i j]}
        (recur more)))))

I re-used find-first from my earlier sequence-based solution.

Upvotes: 0

Leonid Beschastny
Leonid Beschastny

Reputation: 51470

Actually, loop is a pretty good choice unless you want to find all duplicates. Things like reduce will cause the full scan of an input sequence even when it's not necessary.

Here is my version of get-cycle:

(defn get-cycle [coll]
  (loop [i 0 seen {} coll coll]
    (when-let [[x & xs] (seq coll)]
      (if-let [j (seen x)]
        [j i]
        (recur (inc i) (assoc seen x i) xs)))))

The only difference from your get-cycle is that my version returns nil when there is no duplicates.

Upvotes: 0

traffichazard
traffichazard

Reputation: 1295

An option using list processing, but not significantly more concise:

(defn get-cycle [xs]
  (first (filter #(number? (first %))
    (reductions
      (fn [[m i] x] (if-let [xat (m x)] [xat i] [(assoc m x i) (inc i)]))
      [(hash-map) 0] xs))))

Upvotes: 6

Related Questions