user1442023

Reputation: 73

Deduplicating a sequence in Clojure

I need to define a function that takes a sequence and some functions that act on the elements of that sequence. It should return a sequence built from the original one, with the elements that have duplicate function values removed.

(defn dedup [seq & functions] ...)

For example, if

(f1 1) = 'a'
(f1 2) = 'a'
(f1 3) = 'c'
(f1 4) = 'd'

(f2 1) = 'za'
(f2 2) = 'zb'
(f2 3) = 'zc'
(f2 4) = 'zb'

then

(dedup [1 2 3 4] f1 f2) 

returns the sequence (1 3).

How do I do it?

EDIT: Edited the test values to avoid misunderstanding.

EDIT: Below is the (not so functional) implementation for the case of a single function:

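(defn dedup [seq f]
  ;; values: f-results seen so far; seq1: remaining input; seq2: kept elements, newest first
  (loop [values #{} seq1 seq seq2 '()]
    ;; test for emptiness rather than (nil? (first seq1)), so a nil element does not end the loop early
    (if (empty? seq1)
      (reverse seq2)
      (let [s (first seq1)
            v (f s)]
        (if (contains? values v)
          (recur values (rest seq1) seq2)
          (recur (conj values v) (rest seq1) (conj seq2 s)))))))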

Upvotes: 0

Views: 423

Answers (1)

andrew cooke

Reputation: 46882

your example seems to contradict the text - it is returning values where the two functions agree.

(defn dedup [seq & fns]
  (for [s seq :when (apply = (map #(% s) fns))] s))

(dedup [1 2 3 4] 
  #(case % 1 "a" 2 "a" 3 "c" 4 "d") 
  #(case % 1 "a" 2 "b" 3 "c" 4 "b"))
(1 3)

maybe that's a little too compact? #(... % ...) is equivalent to (fn [x] (... x ...)), and the map in dedup runs over the functions, applying each of them to the same value in the sequence.
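to make the two pieces concrete, here is a small sketch of the shorthand and of mapping over a collection of functions. the functions in `fns` (`inc`, `dec`, `str`) and the var names are just illustrative choices, not part of the question.

```clojure
;; #(% 5) is shorthand for (fn [f] (f 5)): it calls its argument on 5.
(def fns [inc dec str])                 ; illustrative functions only

(def results (map #(% 5) fns))          ; one result per function
(def same    (map (fn [f] (f 5)) fns))  ; the same thing, written out

;; apply = then checks whether all of the function values agree.
(def agree?    (apply = (map #(% 0) [identity -])))  ; compares 0 with (- 0)
(def disagree? (apply = (map #(% 1) [identity -])))  ; compares 1 with (- 1)
```

here `results` and `same` hold the same values, `agree?` is true and `disagree?` is false.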

you could also test with

(dedup [1 2 3 4] {1 "a" 2 "a" 3 "c" 4 "d"} {1 "a" 2 "b" 3 "c" 4 "b"})
(1 3)
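passing maps works because a Clojure map can be called as a function of its keys. a short sketch follows; the names `f1` and `f2` are borrowed from the question, and `dedup` is the definition above with the parameter renamed to `coll` so it does not shadow `clojure.core/seq`. one thing to keep in mind (an observation about this sketch, not something the question asks for): a key missing from every map yields nil from each of them, so the values "agree" and the element is kept.

```clojure
(def f1 {1 "a" 2 "a" 3 "c" 4 "d"})
(def f2 {1 "a" 2 "b" 3 "c" 4 "b"})

;; A map looks up its argument as a key; a missing key gives nil.
(def hit  (f1 2))
(def miss (f1 5))

;; Same logic as the answer's dedup, parameter renamed to coll.
(defn dedup [coll & fns]
  (for [s coll :when (apply = (map #(% s) fns))] s))

(def kept          (dedup [1 2 3 4] f1 f2))
(def kept-unknown  (dedup [5] f1 f2))   ; 5 is in neither map: nil = nil
```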

ps i think maybe the confusion is over english meaning. "duplicate" means that a value repeats. so "a" "a" is a duplicate of "a". i suspect what you meant is "multiple" - you want to remove entries where you get multiple (distinct) values.
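for comparison, here is a rough sketch of the other reading, where "duplicate" means a function value that has already appeared earlier in the sequence (which is what the single-function loop in the question does). the name `dedup-by-seen` is hypothetical, and the sketch assumes that values are remembered even for elements that end up being dropped; that is one possible interpretation, not something the question states.

```clojure
;; Keep an element only if none of its function values has been seen before.
;; Assumption: values of dropped elements are still recorded as "seen".
(defn dedup-by-seen [coll & fns]
  (let [step (fn [[seen kept] x]
               (let [vs   (map #(% x) fns)                     ; one value per function
                     dup? (some true? (map contains? seen vs))] ; any value already seen?
                 [(mapv conj seen vs)                           ; remember every value
                  (if dup? kept (conj kept x))]))
        init [(vec (repeat (count fns) #{})) []]]               ; one empty set per function
    (second (reduce step init coll))))

;; Using the question's (edited) test values as maps:
(def by-seen
  (dedup-by-seen [1 2 3 4]
                 {1 "a"  2 "a"  3 "c"  4 "d"}
                 {1 "za" 2 "zb" 3 "zc" 4 "zb"}))
```

with these values `by-seen` is `[1 3]`: 2 is dropped because "a" was already seen, and 4 because "zb" was.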

pps you could also use filter:

(defn dedup [seq & fns] 
  (filter #(apply = (map (fn [f] (f %)) fns)) seq))

where i needed to write one anonymous function explicitly because you can't nest #(...).
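another way to sidestep the nesting restriction, sketched here as an alternative rather than a recommendation, is to build the "apply every function" step with `juxt`, which leaves only one `#(...)`. the name `dedup-juxt` is made up for this example, and it assumes at least one function is passed, since `juxt` needs at least one argument.

```clojure
;; (apply juxt fns) returns a function x -> [(f1 x) (f2 x) ...].
;; Assumes fns is non-empty.
(defn dedup-juxt [coll & fns]
  (let [all-values (apply juxt fns)]
    (filter #(apply = (all-values %)) coll)))

(def juxt-result
  (dedup-juxt [1 2 3 4]
              {1 "a" 2 "a" 3 "c" 4 "d"}
              {1 "a" 2 "b" 3 "c" 4 "b"}))
```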

Upvotes: 3
