cookieisaac
cookieisaac

Reputation: 1407

How to generate a stream of *unique* random numbers in Go using the standard library

How can I generate a stream of unique random number in Go?

I want to guarantee there are no duplicate values in array a using math/rand and/or standard Go library utilities.

func RandomNumberGenerator() *rand.Rand {
    s1 := rand.NewSource(time.Now().UnixNano())
    r1 := rand.New(s1)          
    return r1
}
rng := RandomNumberGenerator()    
N := 10000
for i := 0; i < N; i++ {
    a[i] = rng.Int()
}

There are questions and solutions on how to generate a series of random number in Go, for example, here.

But I would like to generate a series of random numbers that does not duplicate previous values. Is there a standard/recommended way to achieve this in Go?

My guess is to (1) use permutation or to (2) keep track of previously generated numbers and regenerate a value if it's been generated before.

But solution (1) sounds like overkill if I only want a few number and (2) sounds very time consuming if I end up generating a long series of random numbers due to collision, and I guess it's also very memory-consuming.


Use Case: To benchmark a Go program with 10K, 100K, 1M pseudo-random number that has no duplicates.

Upvotes: 13

Views: 25705

Answers (6)

Martin
Martin

Reputation: 4222

I had similar task to pick elements from initial slice by random uniq index. So from slice with 10k elements get 1k random uniq elements.

Here is simple head on solution:

import (
    "time"
    "math/rand"
)

func getRandomElements(array []string) []string {
    result := make([]string, 0)
    existingIndexes := make(map[int]struct{}, 0)
    randomElementsCount := 1000

    for i := 0; i < randomElementsCount; i++ {
        randomIndex := randomIndex(len(array), existingIndexes)
        result = append(result, array[randomIndex])
    }

    return result
}

func randomIndex(size int, existingIndexes map[int]struct{}) int {
    rand.Seed(time.Now().UnixNano())

    for {
        randomIndex := rand.Intn(size)

        _, exists := existingIndexes[randomIndex]
        if !exists {
            existingIndexes[randomIndex] = struct{}{}
            return randomIndex
        }
    }
}

Upvotes: 2

ben
ben

Reputation: 818

you can generate a unique random number with len(12) using UnixNano in golang time package :

uniqueNumber:=time.Now().UnixNano()/(1<<22)
println(uniqueNumber)

it's always random :D

Upvotes: 1

Schwern
Schwern

Reputation: 165004

I see two reasons for wanting this. You want to test a random number generator, or you want unique random numbers.

You're Testing A Random Number Generator

My first question is why? There's plenty of solid random number generators available. Don't write your own, it's basically dabbling in cryptography and that's never a good idea. Maybe you're testing a system that uses a random number generator to generate random output?

There's a problem: there's no guarantee random numbers are unique. They're random. There's always a possibility of collision. Testing that random output is unique is incorrect.

Instead, you want to test the results are distributed evenly. To do this I'll reference another answer about how to test a random number generator.

You Want Unique Random Numbers

From a practical perspective you don't need guaranteed uniqueness, but to make collisions so unlikely that it's not a concern. This is what UUIDs are for. They're 128 bit Universally Unique IDentifiers. There's a number of ways to generate them for particular scenarios.

UUIDv4 is basically just a 122 bit random number which has some ungodly small chance of a collision. Let's approximate it.

n = how many random numbers you'll generate
M = size of the keyspace (2^122 for a 122 bit random number)
P = probability of collision

P = n^2/2M

Solving for n...

n = sqrt(2MP)

Setting P to something absurd like 1e-12 (one in a trillion), we find you can generate about 3.2 trillion UUIDv4s with a 1 in a trillion chance of collision. You're 1000 times more likely to win the lottery than have a collision in 3.2 trillion UUIDv4s. I think that's acceptable.

Here's a UUIDv4 library in Go to use and a demonstration of generating 1 million unique random 128 bit values.

package main

import (
    "fmt"
    "github.com/frankenbeanies/uuid4"
)

func main() {
    for i := 0; i <= 1000000; i++ {
        uuid := uuid4.New().Bytes()

        // use the uuid
    }
}

Upvotes: 1

joshlf
joshlf

Reputation: 23577

You should absolutely go with approach 2. Let's assume you're running on a 64-bit machine, and thus generating 63-bit integers (64 bits, but rand.Int never returns negative numbers). Even if you generate 4 billion numbers, there's still only a 1 in 4 billion chance that any given number will be a duplicate. Thus, you'll almost never have to regenerate, and almost never never have to regenerate twice.

Try, for example:

type UniqueRand struct {
    generated map[int]bool
}

func (u *UniqueRand) Int() int {
    for {
        i := rand.Int()
        if !u.generated[i] {
            u.generated[i] = true
            return i
        }
    }
}

Upvotes: 5

cookieisaac
cookieisaac

Reputation: 1407

Temporary workaround based on @joshlf's answer

type UniqueRand struct {
    generated   map[int]bool    //keeps track of
    rng         *rand.Rand      //underlying random number generator
    scope       int             //scope of number to be generated
}

//Generating unique rand less than N
//If N is less or equal to 0, the scope will be unlimited
//If N is greater than 0, it will generate (-scope, +scope)
//If no more unique number can be generated, it will return -1 forwards
func NewUniqueRand(N int) *UniqueRand{
    s1 := rand.NewSource(time.Now().UnixNano())
    r1 := rand.New(s1)
    return &UniqueRand{
        generated: map[int]bool{},
        rng:        r1,
        scope:      N,
    }
}

func (u *UniqueRand) Int() int {
    if u.scope > 0 && len(u.generated) >= u.scope {
        return -1
    }
    for {
        var i int
        if u.scope > 0 {
            i = u.rng.Int() % u.scope
        }else{
            i = u.rng.Int()
        }
        if !u.generated[i] {
            u.generated[i] = true
            return i
        }
    }
}

Client side code

func TestSetGet2(t *testing.T) {
    const N = 10000
    for _, mask := range []int{0, -1, 0x555555, 0xaaaaaa, 0x333333, 0xcccccc, 0x314159} {
        rng := NewUniqueRand(2*N)
        a := make([]int, N)
        for i := 0; i < N; i++ {
            a[i] = (rng.Int() ^ mask) << 1
        }

        //Benchmark Code
    }
}

Upvotes: 0

user6169399
user6169399

Reputation:

1- Fast positive and negative int32 unique pseudo random numbers in 296ms using std lib:

package main

import (
    "fmt"
    "math/rand"
    "time"
)

func main() {
    const n = 1000000
    rand.Seed(time.Now().UTC().UnixNano())
    duplicate := 0
    mp := make(map[int32]struct{}, n)
    var r int32
    t := time.Now()
    for i := 0; i < n; {
        r = rand.Int31()
        if i&1 == 0 {
            r = -r
        }
        if _, ok := mp[r]; ok {
            duplicate++
        } else {
            mp[r] = zero
            i++
        }
    }
    fmt.Println(time.Since(t))
    fmt.Println("len: ", len(mp))
    fmt.Println("duplicate: ", duplicate)
    positive := 0
    for k := range mp {
        if k > 0 {
            positive++
        }
    }
    fmt.Println(`n=`, n, `positive=`, positive)
}

var zero = struct{}{}

output:

296.0169ms
len:  1000000
duplicate:  118
n= 1000000 positive= 500000

2- Just fill the map[int32]struct{}:

for i := int32(0); i < n; i++ {
        m[i] = zero
}

When reading it is not in order in Go:

for k := range m {
    fmt.Print(k, " ")
}

And this just takes 183ms for 1000000 unique numbers, no duplicate (The Go Playground):

package main

import (
    "fmt"
    "time"
)

func main() {
    const n = 1000000
    m := make(map[int32]struct{}, n)
    t := time.Now()
    for i := int32(0); i < n; i++ {
        m[i] = zero
    }
    fmt.Println(time.Since(t))
    fmt.Println("len: ", len(m))
    //  for k := range m {
    //      fmt.Print(k, " ")
    //  }
}

var zero = struct{}{}

3- Here is the simple but slow (this takes 22s for 200000 unique numbers), so you may generate and save it to a file once:

package main

import "time"
import "fmt"
import "math/rand"

func main() {
    dup := 0
    t := time.Now()
    const n = 200000
    rand.Seed(time.Now().UTC().UnixNano())
    var a [n]int32
    var exist bool
    for i := 0; i < n; {
        r := rand.Int31()
        exist = false
        for j := 0; j < i; j++ {
            if a[j] == r {
                dup++
                fmt.Println(dup)
                exist = true
                break
            }
        }
        if !exist {
            a[i] = r
            i++
        }
    }
    fmt.Println(time.Since(t))
}

Upvotes: 0

Related Questions