Reputation: 2152

Idiomatic Clojure for picking between random weighted choices

While dabbling in Clojure I completed a small example program to pick a random choice from a list of choices.

The basic idea is to iterate over the choices (which are assigned a weight) and turn their weights into a range, then pick a random number within the total range to select one. It might not be the most elegant design, but let's take it for granted.

What would wo do differently compared to my example below?

I'm not interested in overall program structure suggestions, name-spacing, etc, mostly in your approach to each function.

I'm especially interested in how a seasoned Clojurer would tackle the "augment" function, in which I had to use the external "cur" variable to refer to the previous end of the range.

  (def colors
      (hash-map 
            :white 1,
            :red 10,
            :blue 20,
            :green 1,
            :yellow 1
       )
     )

    (def color-list (vec colors))

    (def cur 0)

    (defn augment [x] 
      (def name (nth x 0))
      (def val (nth x 1))
      (def newval (+ cur val))
      (def left cur)
      (def right newval)
      (def cur (+ cur val))
      [name left right]
    )

    (def color-list-augmented (map augment color-list))

    (defn within-bounds [bound]
      (def min-bound (nth bound 1))
      (def max-bound (nth bound 2))
      (and (> choice min-bound) (< choice max-bound))
    )

    (def choice (rand-nth (range cur)))

    (def answer 
      (first (filter within-bounds color-list-augmented))
    )

    (println "Random choice:" (nth answer 0))

Upvotes: 13

Answers (4)

leeor

Reputation: 17801

The open source bigml sampling library is another option. I've used it with some success. It's much better documented and has a nice API.

Upvotes: 0

dribnet

Reputation: 2635

Rich Hickey's somewhat dated (2008) solution from ants.clj:

(defn wrand 
  "given a vector of slice sizes, returns the index of a slice given a
  random spin of a roulette wheel with compartments proportional to
  slices."
  [slices]
  (let [total (reduce + slices)
        r (rand total)]
    (loop [i 0 sum 0]
      (if (< r (+ (slices i) sum))
        i
        (recur (inc i) (+ (slices i) sum))))))

Stuart Halloway's more recent (2012) solution from data.generators:

(defn weighted
  "Given a map of generators and weights, return a value from one of
  the generators, selecting generator based on weights."
  [m]
  (let [weights   (reductions + (vals m))
        total   (last weights)
        choices (map vector (keys m) weights)]
    (let [choice (uniform 0 total)]
      (loop [[[c w] & more] choices]
        (when w
          (if (< choice w)
            (call-through c)
            (recur more)))))))

Upvotes: 9

A. Webb

Reputation: 26446

I suggest doing some problems at http://www.4clojure.com/ while learning Clojure. You can "follow" the top users and see how they solve problems.

Here's a solution. It is again not the most efficient as I was aiming to keep it simple and not use more advanced ideas and constructs that you'll learn later.

user=> (def colors {:white 1  :red 10  :blue 20  :green 1  :yellow 1})
#'user/colors
user=> (keys colors)
(:white :red :blue :green :yellow)   
user=> (vals colors)
(1 10 20 1 1)

To turning the weights into intervals, we just do a cumulative sum:

user=> (reductions #(+ % %2) (vals colors))
(1 11 31 32 33)

Finding the random interval:

user=> (rand-int (last *1))
13
user=> (count (take-while #(<= % *1 ) *2 ))
2

Note *1 in the REPL refers to the most recent value printed, *2 to the next most recent, etc. So we asked for a random integer between 0 (inclusive) and 33 (exclusive). These 33 possible choices correspond to the total of the weights. Next we counted the number of intervals we would need to pass through to find that number. Here the random number was 13.

(1 11 31 32 33) 
     ^ 13 belongs here, 2 numbers in

We find our random number 2 in. Note that in order to land here we had to have at least 11 but less than 31, so 20 possibilities, which is precisely the weight of...

user=> (nth (keys colors) *1)
:blue

So, putting this all together into a function:

(defn weighted-rand-choice [m]
    (let [w (reductions #(+ % %2) (vals m))
          r (rand-int (last w))]
         (nth (keys m) (count (take-while #( <= % r ) w)))))

Let's test it:

user=> (->> #(weighted-rand-choice colors) repeatedly (take 10000) frequencies)
{:red 3008, :blue 6131, :white 280, :yellow 282, :green 299}

Upvotes: 10

Arthur Ulfeldt

Reputation: 91577

It often helps to break the problem into layers that can be solved independently. Augment does this nicely in assigning ranges, though the result is harder to consume with the normal sequence functions while choosing one at random. If you change the goal of augment to have it produce a normal seq then the augmentation problem is more cleanly separated from choosing one randomly. Provided the weights are integers you can build a list containing weight number of each item, and then choose one at random:

user> (map (fn [[item weight]] (repeat weight item)) colors)
((:white) 
 (:red :red :red :red :red :red :red :red :red :red) 
 (:blue :blue :blue :blue :blue :blue :blue :blue :blue :blue
  :blue :blue :blue :blue :blue :blue :blue :blue :blue :blue) 
 (:green) (:yellow))

then flatten it down to a single list:

user> (flatten (map (fn [[item weight]] 
                       (repeat weight item)) 
                 colors))
(:white :red :red :red :red :red :red :red :red :red :red 
 :blue :blue :blue :blue :blue :blue :blue :blue :blue :blue 
 :blue :blue :blue :blue :blue :blue :blue :blue :blue :blue 
 :green :yellow)

and pick one with rand-nth:

user> (rand-nth (flatten (map (fn [[item weight]] (repeat weight item)) colors)))
:blue

ps: map literals make things look better: the reader page describes these nicely

(def colors {:white 1,
             :red 10,
             :blue 20,
             :green 1,
             :yellow 1})

use let to give things names within functions:

(defn augment [x]
  (let [name (nth x 0)
        val (nth x 1)
        newval (+ cur val)
        left cur
        right newval
        cur (+ cur val)]
    [name left right]))

Upvotes: 4

Idiomatic Clojure for picking between random weighted choices

Answers (4)

Related Questions