Reputation: 761
I have a dataframe with index index1 and values val1 and val2. I'm trying to return the count of unique val1 values for each index1.
DataFrame:
import pandas as pd
df = pd.DataFrame(columns=['index1', 'val1', 'val2'], data=[['A', 1, 1], ['A', 1, 1], ['A', 2, 1]])
df = df.set_index(['index1'])
I group like this
groupby = df.groupby([df.index, 'val1'])
Then, I call size(), which returns
index1  val1
A       1       2
        2       1
dtype: int64
This returns the count for each group. I am looking for the number of groups each index1 value has, i.e. A has 2 unique groups.
Upvotes: 1
Reputation: 862511
I think you need SeriesGroupBy.nunique if you need to count the unique values of one column per group:
df1 = df.groupby(level=0)['val1'].nunique()
print (df1)
index1
A    2
Name: val1, dtype: int64
df1 = df.groupby(level=0)['val1'].nunique().reset_index().rename(columns={'val1':'uniq'})
print (df1)
  index1  uniq
0      A     2
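To tie this back to the size() output in the question, here is a minimal sketch, assuming the same example frame. The variable names (sizes, groups_per_index, unique_per_index) are only illustrative. Counting the rows of the size() result per index1 value gives the number of groups, which should match what nunique returns directly:

```python
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame(
    columns=['index1', 'val1', 'val2'],
    data=[['A', 1, 1], ['A', 1, 1], ['A', 2, 1]],
).set_index('index1')

# Size of each (index1, val1) group, as in the question.
sizes = df.groupby([df.index, 'val1']).size()

# Each row of `sizes` is one group, so counting rows per index1
# gives the number of groups for that index value.
groups_per_index = sizes.groupby(level=0).size()

# nunique on the val1 column gives the same counts in one step.
unique_per_index = df.groupby(level=0)['val1'].nunique()

print(groups_per_index)
print(unique_per_index)
```

Both approaches should agree as long as val1 has no missing values; the nunique version skips the intermediate grouping.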
If you need to count unique values in all columns, use agg with nunique:
df1 = df.groupby(level=0).agg(lambda x: x.nunique())
print (df1)
        val1  val2
index1
A          2     1
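The lambda is not strictly needed. As a sketch under the same example data, the grouped frame also has its own nunique method, and agg accepts the method name as a string. The names via_lambda, via_method and via_name are only for illustration:

```python
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame(
    columns=['index1', 'val1', 'val2'],
    data=[['A', 1, 1], ['A', 1, 1], ['A', 2, 1]],
).set_index('index1')

grouped = df.groupby(level=0)

# The lambda form from above.
via_lambda = grouped.agg(lambda x: x.nunique())

# DataFrameGroupBy.nunique called directly.
via_method = grouped.nunique()

# agg with the method name passed as a string.
via_name = grouped.agg('nunique')

print(via_method)
```

All three should produce the same per-column counts; the direct method call is the shortest to read.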
Upvotes: 0
Reputation: 61947
If you want the total number of unique items for each column, you can do the following:
df.groupby(level=0).agg(lambda x: len(x.unique()))
        val1  val2
index1
A          2     1
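One difference worth knowing: len(x.unique()) counts NaN as a value, while nunique() drops it by default. The example data in the question has no missing values, so both give the same result there. The sketch below uses a hypothetical frame, df_nan, with one NaN added purely to show the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the example with one missing value in val1.
df_nan = pd.DataFrame(
    {'index1': ['A', 'A', 'A', 'A'], 'val1': [1.0, 1.0, 2.0, np.nan]}
).set_index('index1')

grouped = df_nan.groupby(level=0)['val1']

# unique() keeps NaN, so it is included in the count.
with_unique = grouped.agg(lambda x: len(x.unique()))

# nunique() excludes NaN by default (dropna=True).
with_nunique = grouped.nunique()

# dropna=False makes nunique count NaN as well.
with_nunique_keep = grouped.nunique(dropna=False)

print(with_unique['A'], with_nunique['A'], with_nunique_keep['A'])
```

Which behaviour is right depends on whether a missing value should count as its own group.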
Upvotes: 1