Reputation: 113
I am completely new to this area. I searched for a solution but couldn't find one that matches my case exactly. I am writing the code in Python in a Jupyter notebook, using the pandas library.
I know how to draw a random sample: df = data.sample(frac=.1)
But I can't figure out how to write the code for this case.
Dataset:
I have this dataset. I want to choose 2 rows at random from each class (Rings). The following is the expected output:
Upvotes: 1
Views: 1513
Reputation: 61910
You could do the following:
Setup
import numpy as np
import pandas as pd
np.random.seed(42)
df = pd.DataFrame({"Shell(g)": np.random.random(14), "Rings": [3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6]})
Code
# shuffle
result = df.sample(frac=1.0)
# get the first two by group
result = result.groupby("Rings").head(2)
# sort by Rings
result = result.sort_values("Rings")
print(result)
Output
    Shell(g)  Rings
1   0.950714      3
0   0.374540      3
3   0.598658      4
2   0.731994      4
7   0.866176      5
6   0.058084      5
12  0.832443      6
10  0.020584      6
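As an alternative sketch, assuming pandas 1.1 or newer is installed, the shuffle-then-head steps can be collapsed into a single call to groupby(...).sample. The DataFrame below repeats the illustrative setup from this answer; the name sampled and the seed value random_state=0 are arbitrary choices made here for reproducibility, not part of the original question.

```python
import numpy as np
import pandas as pd

# Same illustrative frame as in the setup above
np.random.seed(42)
df = pd.DataFrame({
    "Shell(g)": np.random.random(14),
    "Rings": [3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6],
})

# Draw 2 rows at random from each Rings group (needs pandas >= 1.1).
# random_state=0 is an arbitrary seed, used only to make the draw repeatable.
sampled = df.groupby("Rings").sample(n=2, random_state=0)
print(sampled)
```

Note that with the default replace=False this raises a ValueError if any group has fewer than 2 rows, so very small classes may need to be filtered out or sampled with replace=True.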
Upvotes: 2