Amogh Talpallikar
Amogh Talpallikar

Reputation: 12184

How to stop a reduce function from processing the list once the desired accumulation has been reached?

In the procedural world, if I have to find the first item of a list that meets a test, I would just use break or return.

In Clojure, when I am processing a list using reduce to find that first value, won't it be inefficient if I continue and process the entire list?

For example: validating a list of dictionaries for errors; each dictionary has a key called count. Now the total sum of these count fields in the list should not exceed a certain value. How do I find the first item in the list where the sum exceeds the limit ?

Ideally, I would use reduce and maintain a running total; as soon as the total exceeds the limit, I would like to stop there (which I can't figure out how to do).

Also, the return value of the reduce will be the sum till now everytime but I would need to return the index at the end of all.

Upvotes: 9

Views: 3382

Answers (2)

Thumbnail
Thumbnail

Reputation: 13473

I'd be tempted to use reductions rather than reduce to find the break point. This is a lot slower than @Jonas's solution, but allows us to exercise some of the sequence library.

Given, for example

(def data [{:count 19}
           {:count 93}
           {:count 89}
           {:count 91}
           {:count 55}
           {:count 22}
           {:count 22}
           {:count 22}
           {:count 40}
           {:count 89}])

then

(reductions + (map :count data))
; (19 112 201 292 347 369 391 413 453 542)

... returns the sequence of accumulated :count values.

Let's suppose the limit is 300. Then

(let [acceptable (take-while
                   (partial >= 300)
                   (reductions + (map :count data)))]
  acceptable)
; (19 112 201 292)

... returns the initial sub-sequence that meets the limit criterion. This tells us both things we want to know:

  • Its last element is the accumulated :count.
  • Its length - returned, confusingly, by function count - is the position, counting from 0, of the first refused element.

You can use the length, for example, to split the original sequence into accepted and refused parts:

(let [acceptable (take-while ...)]
  (split-at (count acceptable) data))
; [({:count 19} {:count 93} {:count 89} {:count 91})
;  ({:count 55} {:count 22} {:count 22} {:count 22} {:count 40} {:count 89})]

If we know that the original sequence is a vector, we can use subvec to pull out its parts much quicker.

(let [acceptable (take-while ...)
      point (count acceptable)]
  [(subvec data 0 point) (subvec data point)])
; [[{:count 19} {:count 93} {:count 89} {:count 91}]
;  [{:count 55} {:count 22} {:count 22} {:count 22} {:count 40} {:count 89}]]

Upvotes: 2

Jonas
Jonas

Reputation: 19642

You can use the reduced function to terminate a reduction:

(reduce (fn [sum x] 
          (if (> sum 10) 
            (reduced 10) 
            (+ sum x))) 
        0 
        [1 2 3 4 5 6 7 8 9 10])

Upvotes: 19

Related Questions