Dr Rama Lakshmanan
Dr Rama Lakshmanan

Reputation: 67

How to generate random numbers in [0 ... 1.0] in Common Lisp

My understanding of Common Lisp pseudorandom number generation is that (random 1.0) will generate a fraction strictly less than 1. I would like to get numbers upto 1.0 inclusive. Is this possible? I guess I could decide on a degree of precision and generate integers and divide by the range but I'd like to know if there is a more widely accepted way of doing this. Thanks.

Upvotes: 2

Views: 1003

Answers (2)

Kaz
Kaz

Reputation: 58568

I have a bit of a different idea here. Instead of trying to stretch the range over an epsilon, we can work with the original range, and pick a victim number somewhere in that range which gets mapped to the range limit. We can avoid a hard-coded victim by choosing one randomly, and changing it from time to time:

(defun make-random-gen (range)
  (let ((victim nil)
        (count 1))
    (lambda ()
      (when (zerop (decf count))
        (setf count 10000
              victim (random range)))
      (let ((out (random range)))
        (if (eql out victim) range out)))))


(defun testit ()
  (loop with r = (make-random-gen 1.0)
        for x = (funcall r)
        until (eql x 1.0)
        counting t))

At the listener:

[5]> (testit) 
23030093

There is a small bias here in that the victim is never equal to range. So that is to say, the range value such as 1.0 is never victim and therefore always has a certain chance of occurring. Whereas every other value can potentially take a turn at being victim, having its chance of occurring temporarily reduced to zero. That should be faintly detectable in a statistical analysis of the output in that the range value will occur slightly more often than any other value.

It would be interesting to update this approach with a correction for that, an attempt to do which is this:

(defun make-random-gen (range)
  (let ((victim nil)
        (count 1))
    (labels ((gen ()
               (when (zerop (decf count))
                 (setf count 10000
                       victim (gen)))
               (let ((out (random range)))
                 (if (eql out victim) range out))))
      #'gen)))

Now when we select victim, we recurse on our own function which can potentially select range. Whenever range is selected as victim, that value is correctly suppressed: range will not occur in the output, because out will never be eql to range.

We can justify this with the following hand-waving argument:

Let us suppose that the recursive call to gen has a slight bias in favor of range being output. But whenever that happens, range is selected as victim, which prevents it from appearing in the output of gen.

There is a kind of negative feedback which should almost entirely correct the bias.


Note: our random-number-generating lambda would be better designed if it also captured a random state object also and used that. Then the sequence it yields would be undisturbed by other uses of the pseudo-random-number generator. That's a different topic.


On a theoretical note, note that neither [0, 1) nor [0, 1] yield strictly correct distributions. If we had a mathematically ideal PRNG, it would yield actual real numbers in these ranges. Since that range contains an uncountable infinity of real values, each one would occur with a zero probability: 1/aleph-null, which, I'm guessing, so tiny, that it cannot be distinguished from a real zero.

What we want is the floating-point PRNG to approximate the ideal PRNG.

The problem is that each floating-point value approximates a range of real values. So this means that if we have a generator of values in the range 0.0 to 1.0, it actually represents a range of real numbers from -epsilon to 1.0 + epsilon. If we take values from this PRNG and plot a bar graph of values, each bar in the graph has to have some nonzero width. The 0.0 bar is centered on 0, and the 1.0 bar is centered on 1. The distribution of real numbers extends from the left edge of the left bar, to the right edge of the right bar.

In order to create a PRNG which mimics an even distribution of values in the 0.0 to 1.0 interval, we have to include the 0.0 and 1.0 values with half probability. So that is to say, when we collect a large number of values from the PRNG, the 0.0 and 1.0 bars of the graph should be about half as high as all the other bars.

Under these conditions, we cannot distinguish the [0, 1.0) interval from the [0, 1.0] interval because they are exactly as large. We must include the 1.0 value, at about half the usual probability to account for the above uniformity problem. If we simply exclude that value, we create a bias in the wrong direction, because the 1.0 bar in the histogram now has a zero value.

One way we could rescue the situation might be to take the 1.0-epsilon bar of the histogram and make that value 50% more likely, so that the bar is 50% taller than average. Basically, we overload that last value of the range just before 1.0 to represent everything up to and not including 1.0, requiring that value to be more likely. And then, we exclude the 1.0 value from the output. All values approaching 1.0 from the left get mapped to the extra 50% probability of 1.0 - epsilon.

Upvotes: 0

ignis volens
ignis volens

Reputation: 9252

As you say, random will generate numbers in [0,1) by default, and in general (random x) will generate random numbers in [0,x). If these were real numbers and if the distribution really is random, then the probability of getting any number is zero, so this is effectively no different than [0,1]. But they're not real numbers: they're floats, so the probability of getting any particular value is higher since there are only a finite number of floats in [0,1].

Fortunately you can express exactly what you want: CL has a bunch of constants with names like *-epsilon which are defined so that, for instance

(/= (+ 1.0f0 single-float-epsilon) 1.0f0)

and single-float-epsilon is the smallest single-float for which this is true.

Thus (random (+ 1.0f0 single-float-epsilon)) will produce random single-floats in the range [0,1], and will eventually probably turn out 1.0f0. You can test this:

(defun tsit ()
  (let ((f (+ 1.0f0 single-float-epsilon)))
    (assert (/= f 1.0f0) (f) "oops")
    (loop for i upfrom 1
          for v = (random f)
          when (= v 1.0f0)
          return (values i v))))

And for me

> (tsit)
12839205
1.0

If you use double floats it takes ... quite a lot longer ... to get 1.0d0 (and remember to use double-float-epsilon).

Upvotes: 7

Related Questions