Priyatham
Priyatham

Reputation: 2897

How to generate large number of gaussian random variables in Swift

I want to create a large array of random numbers drawn from the gaussian distribution. I found dlarnv, but I am not sure how to use it in Swift. Specifically, the type signature XCode shows is as follows:

dlarnv_(
__idist: UnsafeMutablePointer<__CLPK_integer>,
__iseed: UnsafeMutablePointer<__CLPK_integer>,
__n: UnsafeMutablePointer<__CLPK_integer>,
__x: UnsafeMutablePointer<__CLPK_doublereal>
)

How do I use this to fill an array with single precision floating point numbers? This is how far I have gotten:

n = 10000
var data: [Float]
data.reserveCapacity(n)

data
dlarnv_(
3, // for normal distribution
seed, // not sure how to seed
n,
data, // not sure how to pass a pointer
)

Upvotes: 0

Views: 646

Answers (3)

Rob Napier
Rob Napier

Reputation: 299613

If you want a lot of values at one time, and you want them in a standard normal distribution (µ=0, σ=1), it's very hard to beat dlarnv_. But if you want a bit more flexibility, you should also consider GKLinearCongruentialRandomSource, and a little math. The following is about 20% slower than dlarnv_ for fetching 10M values all at once, but it's 5x faster if you want one value at a time (including adjusting the mean and stddev).

import GameplayKit
let random = GKLinearCongruentialRandomSource()

func randomNormalValue(average: Double, standardDeviation: Double) -> Double {
    let x1 = Double(random.nextUniform())
    let x2 = Double(random.nextUniform())
    let z1 = sqrt(-2 * log(x1)) * cos(2 * .pi * x2)

    return z1 * standardDeviation + average
}

I'm not sure why GKGaussianDistribution isn't implemented this way (or one of the other solutions that are possibly even faster, but I haven't bothered to implement and test). I agree that it's slow.

But it's not that slow. It's about 75% slower than dlarnv_ for 10M random values. Which is slow, but still in the same order-of-magnitude. The issue is the random source itself. Most folks write this:

let random = GKRandomSource()

And that's definitely the safest answer. But that's a cryptographically-secure source of entropy. If you're doing anything where the number needs to be really random, that's what you should be using (and dlarnv_ doesn't do this, so it's unsafe in some cases).

But if you just need "kinda random, but no one is going to try to exploit the fact that it's not really random," the source you want is GKLinearCongruentialRandomSource. And that gets GKGaussianDistribution into the same ballpark as Accelerate (factor of 2, not orders of magnitude).


Update

Based on the comment from @pjs and their answer at Implementing the PRNG xoshiro256+ in Swift for a RN in a given range?, an even faster solution for generating many individual non-cryptographic random values with different distributions is xoshiro256+.

struct XoshiroRandomNormalGenerator {
    // Based on xoshiro256+ 1.0 https://prng.di.unimi.it/xoshiro256plus.c
    // Also based on https://stackoverflow.com/questions/50559229/implementing-the-prng-xoshiro256-in-swift-for-a-rn-in-a-given-range

    func rotl(_ x: UInt64, _ k: Int) -> UInt64 {
        return (x << k) | (x >> (64 - k))
    } // This is the rotating function.

    var s0, s1, s2, s3: UInt64

    var spareNormal: Double?

    init(seed: [UInt64]? = nil) {
        if let seed {
            s0 = seed[0]
            s1 = seed[1]
            s2 = seed[2]
            s3 = seed[3]
        } else {
            s0 = .random(in: (.min)...(.max))
            s1 = .random(in: (.min)...(.max))
            s2 = .random(in: (.min)...(.max))
            s3 = .random(in: (.min)...(.max))
        }
    }

    mutating func nextUniform() -> UInt64 {
        let result_plus = s0 &+ s3

        let t = s1 << 17

        s2 ^= s0
        s3 ^= s1
        s1 ^= s2
        s0 ^= s3

        s2 ^= t

        s3 = rotl(s3, 45)

        return result_plus

    } // This returns the next number in the algorithm while XORing the seed vectors for use in the next call.

    mutating func nextDouble() -> Double {
        // Double has 52 significand digits
        // Shift right 64-52=12, multiply by 2^-52
        Double(nextUniform() >> 12) * 0x1p-52
    }

    mutating func nextNormal(average: Double, standardDeviation: Double) -> Double {
        if let spareNormal {
            self.spareNormal = nil
            return spareNormal * standardDeviation + average
        }

        // Box-Muller
        let x1 = nextDouble()
        let x2 = nextDouble()
        let mag = sqrt(-2 * log(x1))
        let z1 = mag * cos(2 * .pi * x2)
        spareNormal = mag * sin(2 * .pi * x2)

        return z1 * standardDeviation + average
    }
}

This can be used as:

var random = XoshiroRandomNormalGenerator()
let nextValue = random.nextNormal(average: avg, standardDeviation: stddev)

Upvotes: 2

Flex Monkey
Flex Monkey

Reputation: 3643

You might want to consider the random generator in Accelerate's BNNS library:

func randomFloats(n: Int,
                  mean: Float,
                  standardDeviation: Float) -> [Float] {
    
    let result = [Float](unsafeUninitializedCapacity: n) {
        
        buffer, unsafeUninitializedCapacity in
        
        guard
            var arrayDescriptor = BNNSNDArrayDescriptor(
                data: buffer,
                shape: .vector(n)),
            let randomNumberGenerator = BNNSCreateRandomGenerator(
                BNNSRandomGeneratorMethodAES_CTR,
                nil) else {
            fatalError()
        }
        
        BNNSRandomFillNormalFloat(
            randomNumberGenerator,
            &arrayDescriptor,
            mean,
            standardDeviation)
        
        unsafeUninitializedCapacity = n
        BNNSDestroyRandomGenerator(randomNumberGenerator)
    }
    return result
}

Upvotes: 1

Priyatham
Priyatham

Reputation: 2897

Using dlarnv_ is much faster than using GameplayKit

var n: Int32 = 500 // Array size
var d: Int32 = 3 // 3 for Normal(0, 1)
var seed: [Int32] = [1, 1, 1, 1] \\ Ideally pick a random seed
var x: [Double] = Array<Double>(unsafeUninitializedCapacity: Int(n)) { buffer, count in
    dlarnv_(&d, &seed, &n, buffer.baseAddress)
    count = Int(n)
}

Upvotes: 0

Related Questions