Reputation: 33
I'm new to Clojure, so I'll do my best to explain the question. I have a large hash-map that I'm splitting into partitions so that I can run a PageRank calculation in parallel. I want to use pmap, but I don't understand the syntax required. The calc-page-rank function takes four parameters. The collection I want to use pmap on is mapSegments (a sequence of n maps), so it should go through the sequence of maps one by one and run calc-page-rank on each. I've used map with a single-parameter function before, but not with one that takes several. My current syntax does not work, and I'm not sure whether what I'm trying to do is even possible.
The combine-maps function just converts a sequence of maps into one map. Example of mapSegments:
({0 [2 3 5 7 9], 1 [3 2 4 5], 2 [1]} {3 [0], 4 [1 5 8 9], 5 [1 0 6 9]} {6 [1 2 3], 7 [2 1 0], 8 [9 10 1]} {9 [2 3 1 8 7], 10 [1]})
My code:
(defn calc-page-rank [myMap inpagesMap outpagesCountMap pageRankMap]
  (let [d 0.85      ;; damping factor
        p (- 1 d)]  ;; probability of giving up
    (combine-maps
      (for [[pageID inpages] myMap]
        {pageID (+ p (* d (reduce + (for [i inpages]
                                      (/ (get pageRankMap i) (get outpagesCountMap i))))))}))))
(defn main-body [myMap]
  (def inpagesMap (find-inpages myMap))
  (def outpagesCountMap (count-outpages myMap))
  (def mapSegments (map vec-to-map (split-map 4 myMap))) ;; returns a sequence of 4 hash maps
  ; this is the part I'm confused with
  (print (combine-maps (pmap calc-page-rank mapSegments inpagesMap outpagesCountMap (start-rank myMap 1)))) ;; for each segment of the map, run calc-page-rank
  ; ....
(shutdown-agents)
Reputation: 1395
If I understand the question correctly, the first two calls to calc-page-rank would look like:
(calc-page-rank (first mapSegments) inpagesMap outpagesCountMap (start-rank myMap 1))
(calc-page-rank (second mapSegments) inpagesMap outpagesCountMap (start-rank myMap 1))
If this is correct and the rest of the parameters do not change from one call to the next, you can create a new function that takes one parameter.
(fn [segment] (calc-page-rank segment inpagesMap outpagesCountMap (start-rank myMap 1)))
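It may help to see why the original call behaves differently. When map or pmap is given several collections, it takes one element from each collection per call instead of passing each collection whole. Here is a minimal sketch, where describe is a hypothetical stand-in for calc-page-rank that just returns its four arguments:

```clojure
;; Hypothetical stand-in for calc-page-rank: returns its four arguments as a vector.
(defn describe [segment a b c]
  [segment a b c])

;; With several collections, map takes one element from EACH collection per call
;; and stops when the shortest collection runs out.
(def zipped
  (map describe [:s1 :s2] [:a1 :a2] [:b1 :b2] [:c1 :c2]))
;; zipped => ([:s1 :a1 :b1 :c1] [:s2 :a2 :b2 :c2])

;; A one-parameter wrapper keeps the other three arguments fixed for every call.
(def fixed
  (map (fn [segment] (describe segment :a :b :c)) [:s1 :s2]))
;; fixed => ([:s1 :a :b :c] [:s2 :a :b :c])
```

In the question's code, inpagesMap and the other maps would be walked entry by entry in the same way, which is why each call to calc-page-rank does not receive the whole maps.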
This can be simplified by using a shortcut form of the anonymous function by putting a # in front of the calc-page-rank call and using % for the parameter value.
(mapv #(calc-page-rank % inpagesMap outpagesCountMap (start-rank myMap 1)) mapSegments)
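Since pmap takes the same arguments as map, the same wrapper should work with pmap in place of mapv to get the parallel version the question asks for. The sketch below is self-contained and uses assumed names throughout: toy-rank is a simplified stand-in for calc-page-rank, and segments and the ranks map are made-up data. It also binds the shared rank map once with let, because an expression written inside the wrapper, such as (start-rank myMap 1), would be re-evaluated for every segment.

```clojure
;; Toy stand-in for calc-page-rank (an assumption for illustration only):
;; each page in the segment gets the sum of the ranks of its in-pages.
(defn toy-rank [segment rank-map]
  (into {} (for [[page-id inpages] segment]
             [page-id (reduce + (map rank-map inpages))])))

;; Made-up data: three segments, each mapping a page id to its in-pages.
(def segments [{0 [1 2]} {1 [0]} {2 [0 1]}])

(def result
  ;; Compute the shared argument once, outside the wrapper function.
  (let [ranks (zipmap [0 1 2] [1 2 3])]
    ;; pmap runs toy-rank on the segments in parallel;
    ;; apply merge combines the per-segment maps into one map.
    (apply merge (pmap #(toy-rank % ranks) segments))))
;; result => {0 5, 1 1, 2 3}
```

pmap only pays off when each call does enough work to outweigh the coordination overhead, so with a handful of small segments it may not be faster than mapv.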