DinoEntrails
DinoEntrails

Reputation: 31

Clojure: Running computations in several futures doesn't speed up my program

I am just beginning in Clojure and have written the following code for estimating pi via monte carlo simulation. I basically want to create X threads, each of which counts the number of random points that fall in the unit circle and returns it. My main thread then sums them all up and computes Pi.

However, running all the samples in the same thread is faster than splitting the computations among several threads via futures. Why?

(defn sumFutures [workerFutures acc i]
  (if (= -1 i)
    acc
    (recur workerFutures (+ acc @(nth workerFutures i)) (dec i))))

(defn getResults [workerFutures numSamples]
  (* 4.0 (/ (sumFutures workerFutures 0 (dec (count workerFutures))) numSamples)))

(defn isInCircle []
  (let [x (rand) y (rand)]
    (if (<= (+ (* x x) (* y y)) 1)
      1
      0)))

(defn countInCircle [remaining acc]
  (if (zero? remaining)
    acc
    (recur (dec remaining) (+ acc (isInCircle)))))

(defn getWorker [samplesPerWorker]
  (future
    (countInCircle samplesPerWorker 0)))

(defn addWorker [workers samplesPerWorker]
  (conj workers (getWorker samplesPerWorker)))

(defn getWorkers [workers samplesPerWorker remWorkers]
  (if (not (zero? remWorkers))
    (recur (addWorker workers samplesPerWorker) samplesPerWorker (dec remWorkers))
    (doall workers)))

(defn main [numSamples numWorkers]
  (getResults (getWorkers [] (quot numSamples numWorkers) numWorkers) numSamples))

;; Run all in 1 thread
(main 1000000 1)

;; Split among 100 futures (at least 8 threads)
;; SLOWER 
(main 1000000 100)

Several other notes based on debugging findings:

Upvotes: 3

Views: 130

Answers (2)

noisesmith
noisesmith

Reputation: 20194

I think this will be easier to work with if you have idiomatic code.

sumFutures is effectively reimplementing Clojure's definition of +, using recursion directly rather than Clojures optimized reduce implementations. Here's an alternate simpler (and likely faster) definition:

(defn sum-futures
  [workers]
  (apply + (map deref workers)))

getResults is now much easier to read - and I catch a place where rational division is happening - if we don't need rationals, making one operand a double will save a lot of work.

(defn get-results
  [workers num-samples]
  (* 4.0 (/ (sum-futures workers)
            (double num-samples))))

countInCircle can be represented more clearly using clojure's + as well.

(defn count-in-circle
  [n]
  (apply + (repeatedly n isInCircle)))

getWorkers once again does primitive recursive work instead of using Clojure's abstractions. If we use repeatedly, we can eliminate the addWorker and getWorker definitions, without reducing clarity, modularity, or efficiency (in fact in a case like this where you don't need indexed lookup, and the result will be consumed sequentially, the lazy-seq version should perform better than the vector. This is also now a cantidate for a refactoring along with sum-futures to a more efficient transducer based version).

(defn get-workers
  [num-workers samples-per-worker]
  (repeatedly num-workers
              #(future (count-in-circle samples-per-worker))

And finally main becomes:

(defn main
  [num-samples num-workers]
  (get-results (get-workers (quot num-samples num-workers)
                            num-workers)
               num-workers))

Upvotes: 4

Joanis
Joanis

Reputation: 1679

What happens if you drop the number of futures in your second run to 2 (and to 8)? Might be that the time put in managing the threads and futures is greater than the time saved by splitting the work.

It's generally counter-productive to spawn more threads than the hardware can support (as the threads will have to share cores and there's a cost to this). If you don't get a speed gain with only two threads (compared to 1), there's probably something wrong somewhere else.

Upvotes: 0

Related Questions