ihadanny

Reputation: 4483

How to count unique records by two columns per group in pandas?

This is the same as "How to count unique records by two columns in pandas?", only per group: for each value of a, I want the number of unique (b, c) pairs. I tried:

import pandas as pd
df = pd.DataFrame({'a': [1,1,1,2,2], 'b':[10,10,20,30,30], 'c':[5,7,7,11,17]})
df.groupby('a').groupby(['b', 'c']).ngroups

And it throws AttributeError.

Upvotes: 3

Views: 142

Answers (2)

user3483203

Reputation: 51155

You don't need the double groupby. Use drop_duplicates with ['b', 'c'] as the subset to keep only the unique rows, then group by 'a' and use size:

df.drop_duplicates(['b', 'c']).groupby('a').size()

a
1    3
2    2
dtype: int64
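
One caveat: drop_duplicates(['b', 'c']) removes duplicates across the whole frame, so a (b, c) pair that occurs in more than one group of 'a' is counted only for the first group it appears in. A minimal sketch, assuming you want such a pair counted in every group where it occurs, is to include 'a' in the subset. The frame df2 and its extra row are made up for illustration and are not part of the question's data:

```python
import pandas as pd

# Hypothetical data: the pair (10, 5) occurs in both group 1 and group 2
df2 = pd.DataFrame({'a': [1, 1, 1, 2, 2, 2],
                    'b': [10, 10, 20, 30, 30, 10],
                    'c': [5, 7, 7, 11, 17, 5]})

# Deduplicating on ['b', 'c'] alone drops the (10, 5) row from group 2
global_counts = df2.drop_duplicates(['b', 'c']).groupby('a').size()

# Including 'a' in the subset deduplicates within each group instead
per_group_counts = df2.drop_duplicates(['a', 'b', 'c']).groupby('a').size()
```

On the question's own data the two variants give the same result, because no (b, c) pair is shared between groups there.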

Upvotes: 6

DYZ

Reputation: 57033

You need to apply a function to the result of the first grouping:

df.groupby('a').apply(lambda x: x.groupby(['b', 'c']).ngroups)
#a
#1    3
#2    2
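
If calling apply with a Python lambda turns out to be slow on a frame with many groups, one possible alternative (a sketch, not a benchmarked recommendation) is to group by all three columns at once and then count the resulting combinations per value of 'a'. The names combos and result are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 1, 2, 2],
                   'b': [10, 10, 20, 30, 30],
                   'c': [5, 7, 7, 11, 17]})

# One entry per distinct (a, b, c) combination, indexed by a MultiIndex
combos = df.groupby(['a', 'b', 'c']).size()

# Count how many of those combinations fall under each value of 'a'
result = combos.groupby(level='a').size()
```

This should give the same per-group counts as the apply version, and it counts a (b, c) pair separately in each group where it occurs.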

Upvotes: 3
