Daniel Wu
Daniel Wu

Reputation: 6003

how to process sorted data in clojure effectively?

I have a sequence of sorted data, and want to set the neighbor flag. e.g for the following data, for any element, if any neighbor has flag as 1, then set the any-neighbor-flagged as 1 for that element. We could define neighbor as whether the diff of the seq is <=2, if the diff<=2, then they are neighbor. There could be million of data point.

(def a '({:seq 1 :flag 1} {:seq 2 :flag 0} {:seq 5 :flag 0} {:seq 8 :flag 0} {:seq 10 :flag 1} {:seq 12 :flag 1}))

the expected result is:

({:seq 1 :any-neighbor-flagged 0} {:seq 2 :any-neighbor-flagged 1} {:seq 5 :any-neighbor-flagged 0} {:seq 8 :any-neighbor-flagged 1} 
{:seq 10 :any-neighbor-flagged 1} {:seq 12 :any-neighbor-flagged 1})

Upvotes: 0

Views: 46

Answers (2)

Jarlax
Jarlax

Reputation: 1576

Basic idea is to map over 3 sequences - original one, shifted by 1 to the left and shifted by one to the right:

(defn set-flags [coll]
  (map
    (fn [curr {nf :flag} {pf :flag}]
      (-> curr
          (dissoc :flag)
          (assoc :any-neighbor-flagged (if (or (= pf 1) (= nf 1)) 1 0))))
    coll
    (concat [{}] (drop-last coll))
    (concat (rest coll) [{}])))

(set-flags a) ; => ({:any-neighbor-flagged 0, :seq 1} {:any-neighbor-flagged 1, :seq 2} {:any-neighbor-flagged 0, :seq 5} {:any-neighbor-flagged 1, :seq 8} {:any-neighbor-flagged 1, :seq 10} {:any-neighbor-flagged 1, :seq 12})

Illustration (for simplicity, only value of :flag is displayed):

(1 0 0 0 [1] 1) ; original seq
---------------
  (1 0 0 [0] 1) ; shifted to right
(0 0 0 1 [1])   ; shifted to left

Now in map function we also have left neighbor and right neighbor for each element of input (possibly, empty maps). Based on that it's easy to set correct value for :any-neighbor-flagged.

Upvotes: 0

noisesmith
noisesmith

Reputation: 20194

With partition, we can look at a collection with neighboring context.

user=> (partition 3 1 (range 10))
((0 1 2) (1 2 3) (2 3 4) (3 4 5) (4 5 6) (5 6 7) (6 7 8) (7 8 9))

Given an input in that form, we can use reduce to accumulate a result based on neighbor comparisons.

user=> (pprint/pprint (reduce (fn [acc [i j k]]
                                  (conj acc
                                        (assoc j :any-neighbor-flagged
                                               (if (or (= (:flag i) 1)
                                                       (= (:flag k) 1))
                                                   1 0))))
                              []
                              (partition 3 1 (concat [nil] a [nil]))))
[{:any-neighbor-flagged 0, :seq 1, :flag 1}
 {:any-neighbor-flagged 1, :seq 2, :flag 0}
 {:any-neighbor-flagged 0, :seq 5, :flag 0}
 {:any-neighbor-flagged 1, :seq 8, :flag 0}
 {:any-neighbor-flagged 1, :seq 10, :flag 1}
 {:any-neighbor-flagged 1, :seq 12, :flag 1}]

Upvotes: 3

Related Questions