Reputation: 105
I am trying to create a function that creates fake data to use in a separate analysis. Here are the requirements for the function.
Problem 1
In this problem you will create fake data using numpy. In the cell below, the function create_data takes in 2 parameters, "n" and "rand_gen".
Here is the function I have created.
def create_data(n, rand_gen):
    '''
    Creates a numpy array with n samples from the standard normal distribution

    Parameters
    -----------
    n : integer for the number of samples to create
    rand_gen : pseudo-random number generator from numpy

    Returns
    -------
    numpy array from the standard normal distribution of size n
    '''
    numpy_array = np.random.randn(n)
    return numpy_array
Here is the first test I run on my function.
create_data(10, np.random.RandomState(seed=23))
I need the output to be this exact array.
[0.66698806, 0.02581308, -0.77761941, 0.94863382, 0.70167179,
-1.05108156, -0.36754812, -1.13745969, -1.32214752, 1.77225828]
My output is still completely random and I do not fully understand what the RandomState call is trying to do with the seed to create the above array rather than have it be completely random. I know I need to use the rand_gen variable in my function, but I do not know where and I think it's because I just don't understand what it is trying to do.
Upvotes: 1
Views: 1879
Reputation: 20530
I think the question you are asking is about pseudo-random numbers and reproducible random sequences.
True random numbers are derived from unpredictable real-world data, like watching lava lamps, while a pseudo-random number generator produces a long, deterministic sequence of numbers that only appears random.
The basic algorithm is:
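As a rough illustration of the idea, here is a toy generator, assuming a linear congruential update. This is not what NumPy does internally (RandomState is based on the Mersenne Twister); the name toy_prng and the constants are only for illustration. It starts from a seed, applies a fixed arithmetic step to get the next state, and scales that state into a number.

```python
def toy_prng(seed, count):
    """Yield `count` pseudo-random floats in [0, 1) from a toy LCG (illustrative only)."""
    # Constants of a commonly cited linear congruential generator; an assumption
    # for this sketch, not NumPy's algorithm.
    a, c, m = 1664525, 1013904223, 2**32
    state = seed % m
    for _ in range(count):
        state = (a * state + c) % m  # deterministic update of the hidden state
        yield state / m              # scale the state into [0, 1)


first = list(toy_prng(23, 5))
second = list(toy_prng(23, 5))
other = list(toy_prng(24, 5))

print(first == second)  # True: same seed, same sequence
print(first == other)   # False: different seed, different sequence
```

Nothing here is random in the true sense: the whole sequence is fixed by the seed.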
The trick is that specifying the same seed gives you the same sequence every time. You can set the seed for NumPy's global generator with numpy.random.seed(); every run then starts from the same state.
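In your test the seed is already carried by the RandomState object passed in as rand_gen, so the function has to draw from that object; np.random.randn(n) ignores it and uses the global generator instead. A minimal sketch of that change, assuming rand_gen exposes standard_normal (both RandomState and the newer Generator do):

```python
import numpy as np


def create_data(n, rand_gen):
    """Return n standard-normal samples drawn from the supplied generator."""
    # Draw from the generator that was passed in, not from the global np.random functions.
    return rand_gen.standard_normal(n)


a = create_data(10, np.random.RandomState(seed=23))
b = create_data(10, np.random.RandomState(seed=23))

print(np.array_equal(a, b))  # True: same seed, same sequence
```

Because the seed is fixed at 23, repeated calls return identical arrays, which is what lets the test compare against one fixed expected array.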
I hope this is the question you were asking.
Upvotes: 1