Reputation: 31
Is there any way to pick random data from a DataFrame? df.sample() picks random rows or columns. I want to create a DataFrame made up of values sampled at random from the whole DataFrame, not taken row by row or column by column.
Example:
col 0 | col 1 | col 2 |
---|---|---|
1 | 5 | 3 |
2 | 6 | 2 |
3 | 7 | 4 |
4 | 8 | 9 |
I want to turn it into:
col 0 | col 1 | col 2 |
---|---|---|
9 | 7 | 1 |
4 | 3 | 2 |
3 | 2 | 8 |
5 | 6 | 4 |
That is, each cell of the new DataFrame is filled with a value taken from a random position in the original DataFrame.
Upvotes: 0
Views: 191
Reputation: 195593
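You can simply use np.random.shuffle(). Note that on a 2-D array it reorders whole rows only, and it changes df in place only when df.values is a writable view of the frame's data (not guaranteed, for example with mixed dtypes or with copy-on-write enabled):
import numpy as np
np.random.shuffle(df.values)  # in place; reorders rows, not individual cells
print(df)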
Prints (for example):
col 0 col 1 col 2
0 3 7 4
1 1 5 3
2 2 6 2
3 4 8 9
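To check what this kind of shuffle does, here is a small sketch on the question's data. It assumes a single numeric dtype, works on an explicit copy rather than on df.values, and the names arr and row_shuffled are only illustrative. Since np.random.shuffle permutes a 2-D array along its first axis, the rows change order but the values inside each row stay together.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col 0": [1, 2, 3, 4],
                   "col 1": [5, 6, 7, 8],
                   "col 2": [3, 2, 4, 9]})

# Work on an explicit copy so the result does not depend on whether
# df.values happens to be a view of the frame's data.
arr = df.to_numpy().copy()
np.random.shuffle(arr)  # shuffles along axis 0: whole rows are reordered

row_shuffled = pd.DataFrame(arr, index=df.index, columns=df.columns)

# Compare the sets of rows before and after the shuffle.
original_rows = {tuple(r) for r in df.to_numpy().tolist()}
shuffled_rows = {tuple(r) for r in row_shuffled.to_numpy().tolist()}
print(shuffled_rows == original_rows)  # True: every original row is still intact
```

So this gives a row shuffle, similar in effect to df.sample(frac=1), rather than a cell-by-cell shuffle.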
Upvotes: 0
Reputation: 18315
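Going to NumPy, shuffling there, and putting the result back into a DataFrame:
import numpy as np
import pandas as pd
# flatten all cells into one 1-D array
vals = df.to_numpy().ravel()
# shuffle the flattened values in place
np.random.shuffle(vals)
# restore the original shape and labels
new = pd.DataFrame(vals.reshape(df.shape), index=df.index, columns=df.columns)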
to get
>>> new
col 0 col 1 col 2
0 8 1 7
1 3 2 6
2 5 9 4
3 4 3 2
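If a reproducible result is wanted, the same idea can be sketched with NumPy's Generator API. The helper name shuffle_cells and its seed parameter are hypothetical, not part of pandas or NumPy; rng.permutation returns a shuffled copy, so the original frame is left untouched.

```python
import numpy as np
import pandas as pd

def shuffle_cells(df, seed=None):
    """Return a new DataFrame with all cell values randomly rearranged.

    Hypothetical helper for illustration; `seed` makes the result repeatable.
    """
    rng = np.random.default_rng(seed)
    flat = df.to_numpy().ravel()      # all cells as one 1-D array
    shuffled = rng.permutation(flat)  # shuffled copy, `flat` is not modified
    return pd.DataFrame(shuffled.reshape(df.shape),
                        index=df.index, columns=df.columns)

df = pd.DataFrame({"col 0": [1, 2, 3, 4],
                   "col 1": [5, 6, 7, 8],
                   "col 2": [3, 2, 4, 9]})

new = shuffle_cells(df, seed=0)
print(new)
```

Every value from the original appears exactly once in the result, so this is a permutation of the cells. Sampling with replacement would instead need something like rng.choice(flat, size=flat.size).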
Upvotes: 1