Use of 'random_state' parameter in sklearn.utils.shuffle?

Question

What does the random_state parameter of shuffle in sklearn.utils do? Can anyone explain random_state with an example?

Abhinav Arora · Accepted Answer

The shuffle function randomly permutes the rows of your arrays or matrices. In software, "random" sequences are actually pseudo-random: they are generated deterministically from a seed number, so the same seed always yields the same sequence. The random_state parameter lets you pass this seed to sklearn methods. This is useful because it makes the randomness reproducible during development and testing. So, in the shuffle method, if I use the same random_state with the same dataset, I am guaranteed to get the same shuffle every time. Consider the following example:

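import numpy as np
from sklearn.utils import shuffle
X = np.array([[1., 0.], [2., 1.], [0., 0.]])
X = shuffle(X, random_state=20)  # fixed seed, so the permutation is repeatable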

Suppose this gives me the following output:

array([[ 0.,  0.],
       [ 2.,  1.],
       [ 1.,  0.]])

As long as I keep random_state=20, I am guaranteed to get exactly the same shuffle on every run. This is particularly useful for unit tests, where you want reproducible results to assert against.
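To check this yourself, here is a minimal sketch. The names first_run, second_run and the label array y are made up for illustration and are not part of sklearn. It shuffles the same data twice with the same seed and compares the results, then shuffles a feature matrix together with its labels to show that the rows stay paired.

```python
import numpy as np
from sklearn.utils import shuffle

X = np.array([[1., 0.], [2., 1.], [0., 0.]])

# Two independent calls with the same seed give the same permutation.
first_run = shuffle(X, random_state=20)
second_run = shuffle(X, random_state=20)
print(np.array_equal(first_run, second_run))

# shuffle returns shuffled copies; the input array X is left untouched.
# Passing several arrays applies one permutation to all of them,
# so each row of X_s still lines up with its label in y_s.
y = np.array([10, 20, 30])  # hypothetical labels, one per row of X
X_s, y_s = shuffle(X, y, random_state=20)
print(X_s)
print(y_s)
```

If you leave random_state at its default of None, the result depends on NumPy's global random state, so it will generally differ from run to run.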

Hope that helps!
