Reputation: 131088
I have a pandas data frame and group it by two columns (for example col1
and col2
). For fixed values of col1
and col2
(i.e. for a group) I can have several different values in the col3
. I would like to count the number of distinct values from the third columns.
For example, If I have this as my input:
1 1 1
1 1 1
1 1 2
1 2 3
1 2 3
1 2 3
2 1 1
2 1 2
2 1 3
2 2 3
2 2 3
2 2 3
I would like to have this table (data frame) as the output:
1 1 2
1 2 1
2 1 3
2 2 1
Upvotes: 24
Views: 47843
Reputation: 128948
In [17]: df
Out[17]:
0 1 2
0 1 1 1
1 1 1 1
2 1 1 2
3 1 2 3
4 1 2 3
5 1 2 3
6 2 1 1
7 2 1 2
8 2 1 3
9 2 2 3
10 2 2 3
11 2 2 3
In [19]: df.groupby([0,1])[2].apply(lambda x: len(x.unique()))
Out[19]:
0 1
1 1 2
2 1
2 1 3
2 1
dtype: int64
Upvotes: 21