jayjay

Reputation: 107

Distribution of outcomes in dice experiments

I wrote a short Python function to plot the distribution of outcomes in dice experiments. It runs fine, but when I call, for example, dice(1,5000), dice(10,5000) or dice(100,5000), the histogram shows a skewed distribution (a strong preference for 6). The average, however, is the expected value of about 3.5. I thought this might have something to do with the random number generation, so I tried two methods: first random.randint, and second the one in the code below. Both give similar results, as if something were wrong with the upper limit. I'm not sure why the distribution is so skewed.

import matplotlib.pyplot as plt
import numpy as np
import random

# Throw dice
def dice(N,n):
    '''
    N: number of dice
    n: number of experiments
    '''
    result = np.zeros((n,N))
    for i in range(n):
        for j in range(N):
            random_number = random.random()
            outcome = int(random_number * 6 + 1)
            result[i][j]=outcome
    laverage = np.mean(result)

    print('Result of throwing %d dice(s) for %d times:'%(N,n),result)
    print(laverage)
    plt.hist(np.resize(result,(N*n,1)),bins=[x for x in range(1,7)])
    plt.xlabel('Outcome')
    plt.ylabel('Number of occurrences')
    plt.show()

dice(1,5000)

Upvotes: 3

Views: 1245

Answers (3)

Mad Physicist

Reputation: 114478

Running a sample of your code shows that the issue is a plotting problem, not a computational one, which is why you are seeing the correct mean. The following image shows five bars, the last one about twice the height of the others:

pic

Notice also that each bar is labeled at its left edge, so there is no bar labeled "6". This has to do with what plt.hist means by bins:

If bins is a sequence, it defines the bin edges, including the left edge of the first bin and the right edge of the last bin; in this case, bins may be unequally spaced. All but the last (righthand-most) bin is half-open.

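With edges 1 through 6 you get only five bins, and the last one, [5, 6], is closed on both sides, so it collects both the 5s and the 6s. To give each face its own bin, you probably want edges that sit halfway between the integers, something more like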

plt.hist(np.ravel(result), bins=np.arange(0.5, 7.5, 1))

And the result:

enter image description here
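
The effect of the two edge lists can also be checked without plotting. Below is a minimal sketch that calls np.histogram directly on simulated rolls. The seed, the sample size and the variable names are assumptions chosen for illustration, not part of the original code.

```python
import numpy as np

# Assumed setup for illustration: 5000 throws of one die with a fixed seed.
rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=5000)  # the upper bound 7 is exclusive

# Edges 1..6 give five bins: [1,2), [2,3), [3,4), [4,5), [5,6].
# The last bin is closed on the right, so it holds both the 5s and the 6s.
skewed_counts, skewed_edges = np.histogram(rolls, bins=np.arange(1, 7))

# Edges 0.5..6.5 give six bins, one centered on each face.
fair_counts, fair_edges = np.histogram(rolls, bins=np.arange(0.5, 7.5, 1))

print(skewed_counts)  # five counts, the last roughly twice the others
print(fair_counts)    # six counts of similar size
```

The last entry of skewed_counts equals the sum of the last two entries of fair_counts, which is the doubled bar in the plot.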

Unasked Questions

If you want to simulate N * n data points, you can use numpy directly. Replace your original initialization of result and the for loop with any of the following lines:

result = (np.random.uniform(size=(n, N)) * 6 + 1).astype(int)
result = np.random.uniform(1.0, 7.0, size=(n, N)).astype(int)
result = np.random.randint(1, 7, size=(n, N))

The last line is preferable in terms of efficiency and accuracy.
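
As a rough sketch of how the whole simulation step could look once the loops are gone, here is a small function built on NumPy's newer Generator interface (np.random.default_rng), which is a swap-in for the np.random.randint call above and not something the original code uses. The name roll_dice and the seed parameter are hypothetical.

```python
import numpy as np

def roll_dice(N, n, seed=None):
    """Return an (n, N) integer array: n experiments with N dice each."""
    # seed is an assumed convenience parameter for reproducible runs.
    rng = np.random.default_rng(seed)
    # integers() excludes the upper bound, so 7 yields faces 1..6.
    return rng.integers(1, 7, size=(n, N))

result = roll_dice(10, 5000, seed=42)
print(result.shape, result.mean())
```

Because the array already has an integer dtype, it can be passed straight to integer-only tools without a cast.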

Another possible improvement is in how you compute the histogram. Right now, you are using plt.hist, which calls np.histogram and plt.bar. For small integers like you have, np.bincount is arguably a much better binning technique:

count = np.bincount(result.ravel())[1:]
plt.bar(np.arange(1, 7), count)

Notice that this also simplifies the plotting since you specify the centers of the bars directly, instead of having plt.hist guess it for you.
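
Two details are worth guarding against when using np.bincount this way: it only accepts non-negative integers (the array created with np.zeros in the question is float), and its output is shorter than expected if the highest face never shows up. A minimal sketch of a defensive wrapper follows; the helper name face_counts is hypothetical.

```python
import numpy as np

def face_counts(result):
    """Count how often each face 1..6 appears in an array of die rolls."""
    # np.bincount only accepts non-negative integers, so cast first.
    # This matters if result was created with np.zeros, which is float.
    flat = np.asarray(result).astype(int).ravel()
    # minlength=7 guarantees slots 0..6 even if some face never appears.
    return np.bincount(flat, minlength=7)[1:]

counts = face_counts(np.array([[1.0, 2.0], [6.0, 6.0]]))
print(counts)  # [1 1 0 0 0 2]
```

The returned array always has six entries, so it lines up with np.arange(1, 7) when passed to plt.bar.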

Upvotes: 3

cglacet

Reputation: 10962

If you are lazy (like me), you can also use numpy to directly generate a matrix and seaborn to deal with bins for you:

import numpy as np
import seaborn as sns

dices = 1000
throws = 5000
x = np.random.randint(6, size=(dices, throws)) + 1
sns.distplot(x)

Which gives:

enter image description here

Seaborn usually makes good choices, which can save a bit of configuration time. It's worth a try at least. You can also pass kde=False to the seaborn plot to get rid of the density estimate. Note that distplot has been deprecated since seaborn 0.11 in favor of histplot and displot.

Just for the sake of it, and to show how seaborn behaves, here is the same with the sum over 100 dice:

dices = 100
throws = 5000
x = np.random.randint(6, size=(dices, throws)) + 1
sns.distplot(x.sum(axis=0), kde=False)

enter image description here
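
The sums can be sanity-checked numerically as well. A single fair die has mean 3.5 and variance 35/12, so the sum of 100 independent dice should have mean 350 and standard deviation sqrt(100 * 35 / 12), about 17.08. The sketch below compares the simulated values with these figures; the seed and the use of default_rng are assumptions for illustration.

```python
import numpy as np

dices = 100
throws = 5000
rng = np.random.default_rng(1)  # fixed seed, assumed for reproducibility
x = rng.integers(1, 7, size=(dices, throws))
sums = x.sum(axis=0)  # one total per throw

# Theoretical values for the sum of independent fair dice.
expected_mean = dices * 3.5
expected_std = np.sqrt(dices * 35 / 12)

print(sums.mean(), expected_mean)
print(sums.std(), expected_std)
```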

Upvotes: 1

Sam

Reputation: 1415

Your plot is only showing 5 bars. Each bar is drawn to the right of its number, so I believe the results for 5 and 6 are being combined. If you change the bins to range(1,8), you see more of what you expect.

enter image description here
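
The suspicion that 5 and 6 are merged can be tested with the standard library alone. The following sketch tallies the rolls exactly with collections.Counter and compares that with a hand-written binning routine. The helper bin_with_edges is hypothetical: it only mimics the rule that every bin is half-open except the last one, and is not matplotlib's actual implementation.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed, an assumption for reproducibility
rolls = [rng.randint(1, 6) for _ in range(5000)]
tally = Counter(rolls)  # exact count per face, no binning involved

def bin_with_edges(values, edges):
    """Count values into bins [e0,e1), [e1,e2), ..., with the last bin closed."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            lo, hi = edges[i], edges[i + 1]
            if lo <= v < hi or (i == last and v == hi):
                counts[i] += 1
                break
    return counts

five_bins = bin_with_edges(rolls, list(range(1, 7)))  # edges 1..6
six_bins = bin_with_edges(rolls, list(range(1, 8)))   # edges 1..7

print(five_bins)
print(six_bins)
```

With edges from range(1,7) the last count equals tally[5] + tally[6], while range(1,8) reproduces the six exact tallies.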

Upvotes: 5
