Reputation: 306
I have a dataframe:
import pandas as pd

df = pd.DataFrame({"X1": ["A", "B", "A", "B", "B", "C", "C", "C"],
                   "X2": ["FOO", "BAR", "FOO1", "BAR1", "FOO2", "BAR2", "FOO3", "BAR3"]})
X1 X2
0 A FOO
1 B BAR
2 A FOO1
3 B BAR1
4 B FOO2
5 C BAR2
6 C FOO3
7 C BAR3
Calling value_counts on X1 gives A: 2, B: 3, C: 3. I want to keep, for every value of X1, only as many rows as A has, so that the resulting dataframe contains 2 rows of A, 2 rows of B and 2 rows of C.
The expected output is:
X1 X2
0 A FOO
2 A FOO1
1 B BAR
3 B BAR1
5 C BAR2
6 C FOO3
Upvotes: 3
Views: 67
Reputation: 862921
Use GroupBy.head with N set to the number of A rows. Get N by comparing the column with Series.eq (equivalent to ==) and summing the resulting booleans, then sort by column X1 before grouping:
N = df['X1'].eq('A').sum()
df = df.sort_values('X1').groupby('X1').head(N)
print(df)
X1 X2
0 A FOO
2 A FOO1
1 B BAR
3 B BAR1
5 C BAR2
6 C FOO3
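For reference, here is a self-contained sketch of the same approach, using the sample data from the question. The names n and result, and the kind="stable" argument, are additions for illustration and not part of the original answer. sort_values defaults to quicksort, which is not guaranteed to keep rows with equal keys in their original order, so a stable sort is requested explicitly here.

```python
import pandas as pd

df = pd.DataFrame({
    "X1": ["A", "B", "A", "B", "B", "C", "C", "C"],
    "X2": ["FOO", "BAR", "FOO1", "BAR1", "FOO2", "BAR2", "FOO3", "BAR3"],
})

# Number of rows whose X1 equals "A" (each True counts as 1 when summed).
n = int(df["X1"].eq("A").sum())

# kind="stable" (an assumption added here) keeps the original row order
# inside each X1 group, so head(n) takes the first n rows of every group
# in the order they appear in df.
result = df.sort_values("X1", kind="stable").groupby("X1").head(n)
print(result)
```

Assigning to result instead of overwriting df keeps the original frame available, which may be convenient if the counts need to be recomputed later.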
Upvotes: 7