Reputation: 1088
I'm playing around with matplotlib to learn its features, but I'm struggling to generate random data to test my graph with. Can anyone tell me what I am doing incorrectly here?
import numpy as np
labels = numpy.random.random_integers(0, high=1, size=10000)
x = numpy.random.random_integers(1, high=10, size=10000)
y = numpy.random.random_integers(1, high=10, size=10000)
plt.ylabel("Y")
plt.xlabel("X")
plt.hist(x, label='1')
plt.hist(x[y==0], label='0')
plt.legend(loc='upper right')
plt.savefig('testRand.png')
Further to this, how can I control how the data is distributed within a range? For example, I want x to hold 10% 1's, 20% 2's and 70% 3's, so that my graphs look pretty and possibly show meaningful distributions.
Thanks :)
Upvotes: 1
Views: 1037
Reputation: 46530
If you want to generate samples from meaningful distributions, many are supplied by numpy.random, for example:
x = np.random.exponential(2, 10000)
Many more are in scipy.stats:
from scipy import stats
# a, b, c, z are the distribution's shape parameters and must be defined first
stats.gausshyper.rvs(a, b, c, z, size=10000)
To do something custom like what you want, you can create your own distribution with scipy.stats.rv_continuous or rv_discrete, where you can define whatever pdf or pmf you wish.
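For the 10% / 20% / 70% split from the question, the rv_discrete route might look roughly like the sketch below. The name "custom", the sample size and the random_state value are illustrative choices, not anything required by scipy.

```python
import numpy as np
from scipy import stats

# Values and their probabilities (the probabilities must sum to 1).
values = [1, 2, 3]
probs = [0.1, 0.2, 0.7]

# "custom" is only an illustrative name for the distribution.
custom = stats.rv_discrete(name="custom", values=(values, probs))

# random_state is fixed here only to make the sketch repeatable.
x = custom.rvs(size=10000, random_state=0)

# Compare the defined pmf with the observed fraction of each value.
print(custom.pmf(values))
print([float(np.mean(x == v)) for v in values])
```

The observed fractions should land close to the requested probabilities, with some sampling noise.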
Or, a simpler hack for your example might be:
np.random.choice([1, 2, 2, 3, 3, 3, 3, 3, 3, 3], size=10000)
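np.random.choice also accepts a p argument, so the weights can be given directly instead of repeating values in the list. The sketch below combines that with the plotting code from the question. As posted, that script imports numpy as np but calls numpy.random, never imports matplotlib.pyplot, and masks with y == 0 even though y is drawn from 1 to 10, so the second histogram is always empty. It is an assumption here that labels == 0 was the intended mask. The Agg backend, the seed, the bin edges and the legend labels are also choices made only for this sketch, and np.random.randint stands in for the deprecated random_integers.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)  # fixed seed, only to make the sketch repeatable
n = 10000

# Binary labels: randint excludes the upper bound, so this yields 0 or 1.
labels = np.random.randint(0, 2, size=n)

# 10% ones, 20% twos, 70% threes, expressed directly as probabilities.
x = np.random.choice([1, 2, 3], size=n, p=[0.1, 0.2, 0.7])

bins = [0.5, 1.5, 2.5, 3.5]  # one bin centred on each value
plt.hist(x, bins=bins, label="all")
# Assumed intent: select by labels rather than by y == 0.
plt.hist(x[labels == 0], bins=bins, label="labels == 0")
plt.xlabel("X")
plt.ylabel("Y")
plt.legend(loc="upper right")
plt.savefig("testRand.png")
```

Roughly half of the samples should end up in the second histogram, since the labels are split evenly between 0 and 1.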
Upvotes: 2