Roman
Roman

Reputation: 131088

How to count distinct values in a column of a pandas group by object?

I have a pandas data frame and group it by two columns (for example col1 and col2). For fixed values of col1 and col2 (i.e. for a group) I can have several different values in the col3. I would like to count the number of distinct values from the third columns.

For example, If I have this as my input:

1  1  1
1  1  1
1  1  2
1  2  3
1  2  3
1  2  3
2  1  1
2  1  2
2  1  3
2  2  3
2  2  3
2  2  3

I would like to have this table (data frame) as the output:

1  1  2
1  2  1
2  1  3
2  2  1

Upvotes: 24

Views: 47843

Answers (2)

Jeff
Jeff

Reputation: 128948

In [17]: df
Out[17]: 
    0  1  2
0   1  1  1
1   1  1  1
2   1  1  2
3   1  2  3
4   1  2  3
5   1  2  3
6   2  1  1
7   2  1  2
8   2  1  3
9   2  2  3
10  2  2  3
11  2  2  3

In [19]: df.groupby([0,1])[2].apply(lambda x: len(x.unique()))
Out[19]: 
0  1
1  1    2
   2    1
2  1    3
   2    1
dtype: int64

Upvotes: 21

Roman
Roman

Reputation: 131088

df.groupby(['col1','col2'])['col3'].nunique().reset_index()

Upvotes: 28

Related Questions