Maku

Reputation: 47

Selecting reoccurring values

Dataframe:

Group   Name  Pop
A         F     5
A         C     4
A         D     4
B         E     6
B         L     4

I need a dataframe that keeps only the rows belonging to groups with at least three names. Expected output:

Group  Name  Pop
A        F     5
A        C     4
A        D     4

I figured the easiest way would be to group by Group and keep the groups whose row count is three or more. I've tried several approaches, but I always get errors.

df['Group'].apply(lambda x: x.value_counts()>2)  #for example this

Upvotes: 2

Views: 48

Answers (1)

EdChum

Reputation: 394099

The groupby way to do this is to group by 'Group' and then filter:

In [6]:

df.groupby('Group').filter(lambda x: x['Name'].count() > 2)
Out[6]:
  Group Name  Pop
0     A    F    5
1     A    C    4
2     A    D    4
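
For anyone who wants to reproduce this outside an interactive session, here is a minimal self-contained sketch. It assumes the dataframe is built exactly from the sample data in the question; the variable name `result` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Group": ["A", "A", "A", "B", "B"],
    "Name": ["F", "C", "D", "E", "L"],
    "Pop": [5, 4, 4, 6, 4],
})

# filter() calls the lambda once per group and keeps every row of the
# groups for which it returns True; here, groups with more than two names.
result = df.groupby("Group").filter(lambda g: g["Name"].count() > 2)
print(result)
```

Note that count() only counts non-null values, so rows with a missing Name would not contribute to the total.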

The above doesn't discount duplicate names. If you want to keep only groups with three or more unique names, you can filter using nunique instead:

In [7]:

df.groupby('Group').filter(lambda x: x['Name'].nunique() > 2)
Out[7]:
  Group Name  Pop
0     A    F    5
1     A    C    4
2     A    D    4
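
On the sample data both versions return the same rows, because no name repeats within a group. The difference only shows up with duplicates, so here is a small sketch on hypothetical data (not from the question) where group B has three rows but only two distinct names. The names `df_dup`, `by_count`, and `by_unique` are made up for illustration.

```python
import pandas as pd

# Hypothetical data: group B has three rows, but the name "E" appears twice.
df_dup = pd.DataFrame({
    "Group": ["A", "A", "A", "B", "B", "B"],
    "Name": ["F", "C", "D", "E", "L", "E"],
    "Pop": [5, 4, 4, 6, 4, 1],
})

# count() looks at the number of rows, so both groups pass.
by_count = df_dup.groupby("Group").filter(lambda g: g["Name"].count() > 2)

# nunique() looks at distinct names, so group B (only E and L) is dropped.
by_unique = df_dup.groupby("Group").filter(lambda g: g["Name"].nunique() > 2)

print(by_count)
print(by_unique)
```

Which one you want depends on whether repeated names should count toward the threshold.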

Upvotes: 1
