Reputation: 12184
In the procedural world, if I have to find the first item of a list that meets a test, I would just use break
or return
.
In Clojure, when I am processing a list using reduce
to find that first value, won't it be inefficient if I continue and process the entire list?
For example: validating a list of dictionaries for errors; each dictionary has a key called count
. Now the total sum of these count fields in the list should not exceed a certain value. How do I find the first item in the list where the sum exceeds the limit ?
Ideally, I would use reduce
and maintain a running total; as soon as the total exceeds the limit, I would like to stop there (which I can't figure out how to do).
Also, the return value of the reduce will be the sum till now everytime but I would need to return the index at the end of all.
Upvotes: 9
Views: 3382
Reputation: 13473
I'd be tempted to use reductions
rather than reduce
to find the break point. This is a lot slower than @Jonas's solution, but allows us to exercise some of the sequence library.
Given, for example
(def data [{:count 19}
{:count 93}
{:count 89}
{:count 91}
{:count 55}
{:count 22}
{:count 22}
{:count 22}
{:count 40}
{:count 89}])
then
(reductions + (map :count data))
; (19 112 201 292 347 369 391 413 453 542)
... returns the sequence of accumulated :count
values.
Let's suppose the limit is 300
. Then
(let [acceptable (take-while
(partial >= 300)
(reductions + (map :count data)))]
acceptable)
; (19 112 201 292)
... returns the initial sub-sequence that meets the limit criterion. This tells us both things we want to know:
last
element is the accumulated :count
.count
- is the position, counting from 0, of the first refused element.You can use the length, for example, to split the original sequence into accepted and refused parts:
(let [acceptable (take-while ...)]
(split-at (count acceptable) data))
; [({:count 19} {:count 93} {:count 89} {:count 91})
; ({:count 55} {:count 22} {:count 22} {:count 22} {:count 40} {:count 89})]
If we know that the original sequence is a vector, we can use subvec
to pull out its parts much quicker.
(let [acceptable (take-while ...)
point (count acceptable)]
[(subvec data 0 point) (subvec data point)])
; [[{:count 19} {:count 93} {:count 89} {:count 91}]
; [{:count 55} {:count 22} {:count 22} {:count 22} {:count 40} {:count 89}]]
Upvotes: 2