matanox

Reputation: 13686

can you explain pmap laziness and memory footprint?

The docs say about pmap:

Like map, except f is applied in parallel. Semi-lazy in that the parallel computation stays ahead of the consumption, but doesn't realize the entire result unless required.

Can you kindly clarify these two statements in some simple context? Also is there for the pmap function, a doseq equivalent, with a memory footprint that stays constant relative to the size of the iterated collection?

Upvotes: 0

Views: 282

Answers (2)

Leon Barrett

Reputation: 121

While Taylor's answer is correct, I also gave a presentation on what happens inside of pmap, and how it's lazy, at Clojure West a few years ago. I know not everyone likes videos for learning, but if you do, it might be helpful: https://youtu.be/BzKjIk0vgzE?t=11m48s

(If you want non-lazy pmap, I second the endorsement for Claypoole.)

Upvotes: 1

Taylor Wood

Reputation: 16194

Semi-lazy in that the parallel computation stays ahead of the consumption

This means that pmap will do slightly more work than is strictly required by the sequence's consumer. This "working ahead" minimizes the wait for more items to be computed when the sequence is consumed. For example, if you're computing some infinite sequence in parallel and you only consume the first 50 results, pmap may have gone ahead and computed 50+N. In the current clojure.core implementation, N is the number of available processors plus 2, although a chunked input sequence can cause more items to be computed at once.

but doesn't realize the entire result unless required.

This means it's only going to work ahead up to a certain threshold. The entire sequence won't be produced unless it's completely consumed (or almost completely consumed).
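As a rough illustration of that threshold, here is a minimal sketch using a finite, unchunked input of 1000 items. The names started, slow-inc, look-ahead, results and first-five are made up for this example, and the look-ahead formula mirrors what the clojure.core source does today, which is an implementation detail rather than a documented guarantee.

```clojure
;; Counts how many times the mapped function has been invoked.
(def started (atom 0))

;; Hypothetical stand-in for an expensive function.
(defn slow-inc [x]
  (swap! started inc)
  (inc x))

;; Look-ahead window used by the current clojure.core/pmap source
;; (assumption: implementation detail, may change between versions).
(def look-ahead (+ 2 (.availableProcessors (Runtime/getRuntime))))

;; (take 1000 (iterate inc 0)) is an unchunked lazy seq, so chunking
;; does not inflate the amount of work done ahead of the consumer.
(def results (pmap slow-inc (take 1000 (iterate inc 0))))

;; Consume only the first five results.
(def first-five (doall (take 5 results)))
;; first-five => (1 2 3 4 5)
;; @started is now roughly (+ 5 look-ahead), far below 1000.
```

Consuming five items starts only a handful of extra computations, so the remaining items of the 1000-element input stay unrealized until something asks for them.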

Also is there for the pmap function, a doseq equivalent

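You can use doall or dorun with pmap to produce side effects in parallel. Of the two, dorun is the closer match to doseq: it walks the sequence and discards each result, whereas doall retains the whole realized sequence in memory.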
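A minimal sketch of the dorun approach, assuming a hypothetical side-effecting function handle-item! and a counter named processed (neither is from the question). Because dorun does not hold on to the head of the sequence, memory use should stay roughly bounded by the look-ahead window rather than by the size of the input.

```clojure
;; Counter standing in for a real side effect (hypothetical).
(def processed (atom 0))

;; Hypothetical side-effecting function; the nil return value is discarded.
(defn handle-item! [x]
  (swap! processed inc)
  nil)

;; Walk the whole pmap result for its side effects only.
;; dorun returns nil and does not retain the realized elements.
(dorun (pmap handle-item! (range 10000)))

;; Every element was dereferenced by the time dorun returns.
;; @processed => 10000
```

Swapping dorun for doall here would produce the same side effects but would also keep all 10000 return values alive.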

Here's an example that illustrates all three points together, using an infinite sequence as input to pmap:

(def calls (atom 0))
(dorun (take 50 (pmap (fn [_] (swap! calls inc)) (range))))
;; @calls => 60 (for example; the exact value depends on the machine's core count)

When this completes, the value of calls will be over 50, even though we only consumed 50 items from the sequence.

Also read up on reducers and core.async for another way to do the same thing.
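For a taste of the reducers route, here is a minimal sketch using clojure.core.reducers, which ships with Clojure. Unlike pmap it is eager rather than lazy, it suits pure computations better than side effects, and it only parallelizes foldable collections such as vectors. The name total is made up for this example.

```clojure
(require '[clojure.core.reducers :as r])

;; r/map builds a reducible recipe; r/fold runs it in parallel
;; (fork/join) when the source is a foldable collection like a vector.
;; + serves as both the reducing and the combining function,
;; and (+) => 0 supplies the identity value.
(def total (r/fold + (r/map inc (vec (range 1000)))))
;; total => 500500
```

core.async is a separate library, so it is left out of this sketch.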

Upvotes: 2
