Alessandrini

Reputation: 191

Python: select N random rows per group from a DataFrame

I have a dataframe with 2 columns, and for each value in column A I want to select N rows from column B.

   A    B
   0    A
   0    B
   0    I
   0    D
   1    A
   1    F
   1    K
   1    L
   2    R

For each unique number in column A, give me N random rows from column B. If a value in column A has fewer than N rows, return all of its rows. For example, if N == 2 the resulting dataframe would look like this:

   A    B
   0    A
   0    D
   1    F
   1    K
   2    R

Upvotes: 0

Views: 432

Answers (1)

jezrael

Reputation: 862611

Use DataFrame.sample on each group inside GroupBy.apply, with an if-else test on the group length so that groups with fewer than N rows are returned unchanged:

N = 2
# rows are drawn at random, so the output varies between runs
df1 = df.groupby('A').apply(lambda x: x.sample(N) if len(x) >= N else x).reset_index(drop=True)
print(df1)
   A  B
0  0  I
1  0  D
2  1  A
3  1  K
4  2  R

Or pass group_keys=False to keep the original index:

N = 2
df1 = df.groupby('A', group_keys=False).apply(lambda x: x.sample(N) if len(x) >= N else x)
print(df1)
   A  B
0  0  A
3  0  D
5  1  F
6  1  K
8  2  R
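
For anyone who wants something to paste and run, here is a minimal self-contained sketch. It is not one of the two snippets above but a possible alternative: shuffle the whole frame, then keep the first N rows of each group with GroupBy.head. The frame is rebuilt from the sample data in the question. The name df2 and the random_state value are assumptions added for illustration, and the seed is only there to make the run repeatable.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "A": [0, 0, 0, 0, 1, 1, 1, 1, 2],
    "B": ["A", "B", "I", "D", "A", "F", "K", "L", "R"],
})

N = 2

# Shuffle all rows, keep the first N rows of each group (fewer if the
# group is smaller than N), then restore the original row order.
# random_state is an illustrative choice; drop it for a fresh draw each run.
df2 = df.sample(frac=1, random_state=0).groupby("A").head(N).sort_index()
print(df2)
```

Because head(N) simply returns every row of a group that has fewer than N rows, no explicit length check is needed, and the original index is kept as in the group_keys=False variant.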

Upvotes: 1
