Creating fake data in Python

Question

I am trying to create a function that creates fake data to use in a separate analysis. Here are the requirements for the function.

Problem 1

In this problem you will create fake data using numpy. In the cell below, the function create_data takes two parameters, "n" and "rand_gen".

The "rand_gen" parameter is a pseudo-random number generator. We use a seeded pseudo-random number generator so that the results are the same on every run.
Use the randn function of the pseudo-random generator to create a numpy array of length n and return the array.

Here is the function I have created.

def create_data(n, rand_gen):
    '''
    Creates a numpy array with n samples from the standard normal distribution

    Parameters
    -----------
    n : integer for the number of samples to create
    rand_gen : pseudo-random number generator from numpy

    Returns
    -------
    numpy array from the standard normal distribution of size n
    '''

    numpy_array = np.random.randn(n)
    return numpy_array

Here is the first test I run on my function.

create_data(10, np.random.RandomState(seed=23))

I need the output to be this exact array.

[ 0.66698806,  0.02581308, -0.77761941,  0.94863382,  0.70167179,
 -1.05108156, -0.36754812, -1.13745969, -1.32214752,  1.77225828]

My output is different on every run, and I do not fully understand how the RandomState call uses the seed to produce the array above instead of random values. I know I need to use the rand_gen argument in my function, but I do not know where, probably because I don't understand what it does.

Some_acctg_guy · Accepted Answer

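Define numpy_array = rand_gen.randn(n). Calling np.random.randn(n) draws from NumPy's global generator and ignores the seeded rand_gen you passed in, which is why your output changes on every run.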
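
For illustration, here is a minimal sketch of the function with that one-line change, assuming numpy is imported as np as in the question. The names first and second are only used here to show that two generators built from the same seed return the same values; per the question, seed 23 should give the array starting with 0.66698806.

```python
import numpy as np


def create_data(n, rand_gen):
    """Return n samples from the standard normal distribution, drawn from rand_gen."""
    # Draw from the generator that was passed in, not from the global np.random state.
    return rand_gen.randn(n)


# Two generators built from the same seed produce the same sequence.
first = create_data(10, np.random.RandomState(seed=23))
second = create_data(10, np.random.RandomState(seed=23))
print(first)
print(np.array_equal(first, second))  # True
```

The seed fixes the generator's starting state, so every RandomState created with seed=23 produces the same stream of numbers. The values are only reproducible if you draw them from that object.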
