FrankBool

Reputation: 27

Pandas groupby, filter and put the output in a list

Hello guys, I have a problem with a function that I want to implement in my code. Assume that I am working with this data frame:

df = pd.DataFrame([[100, 1],[100, 1],[200, 2],[200, 2],[200, 2]], columns=['a','b'])

Now I would like to first count the occurrences of each unique entry in column "a" and then select only those elements of column "a" whose count is bigger than 3.

group=df.groupby('a').count()
filter=group['b'].isin([3])

The desired output is a list that contains ONLY those elements of column "a" that satisfy the filter condition (named "filter"), so that I can use it to filter the initial data frame and keep only rows 2, 3, 4 (counting from zero).

I hope my intent is clear, but of course if I am approaching the problem from the wrong point of view, any suggestion is welcome.

Upvotes: 0

Views: 568

Answers (2)

Setop

Reputation: 2500

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([[100, 1],[100, 1],[200, 2],[200, 2],[200, 2]], columns=['a','b'])

In [3]: pd.concat([i[1] for i in df.groupby('a') if len(i[1]) >2 ])
Out[3]: 
     a  b
2  200  2
3  200  2
4  200  2
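
The same result can probably be had without the explicit loop and concat by using the groupby filter method. A minimal sketch, assuming the same "more than 2 rows per group" threshold as above (the name result is just for illustration):

```python
import pandas as pd

df = pd.DataFrame([[100, 1], [100, 1], [200, 2], [200, 2], [200, 2]], columns=['a', 'b'])

# Keep only the rows belonging to groups of 'a' that contain more than 2 rows.
# The original index is preserved, so rows 2, 3 and 4 are returned.
result = df.groupby('a').filter(lambda g: len(g) > 2)
print(result)
```

Note that the lambda is called once per group, so on a frame with many groups this may be slower than a vectorized approach.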

Upvotes: 0

Scott Boston

Reputation: 153500

IIUC, I don't think you have enough test data to test "bigger than 3"; however, you can test "bigger than 2".

df[df.groupby('a')['a'].transform('count').gt(2)]

Output:

     a  b
2  200  2
3  200  2
4  200  2
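
If the list of qualifying values of "a" is needed as well, as the question asks, something along these lines should work. This is only a sketch: the names counts, keep and filtered are made up for illustration, and the threshold is "bigger than 2" again because of the test data.

```python
import pandas as pd

df = pd.DataFrame([[100, 1], [100, 1], [200, 2], [200, 2], [200, 2]], columns=['a', 'b'])

# Number of rows for each distinct value of 'a'
counts = df.groupby('a').size()

# Values of 'a' that occur more than 2 times, as a plain Python list
keep = counts[counts > 2].index.tolist()

# Use that list to filter the original data frame
filtered = df[df['a'].isin(keep)]

print(keep)
print(filtered)
```

Here keep holds the values of "a" that pass the condition, and filtered is the original frame restricted to those values.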

Upvotes: 1
