Reputation: 8628
This is my data frame df:
CITY  ID_C
abc   123
abc   123
abc   456
def   123
def   456
def   789
def   789
I need to calculate the number of unique values of ID_C grouped by CITY:
CITY  TOTAL_UNIQUE_COUNT
abc   2
def   3
I tried the code below, but it raises ValueError: cannot insert ID_CITIZEN, already exists:
df.groupby('CITY').ID_C.value_counts().reset_index()
Upvotes: 1
Views: 40
Reputation:
There is a direct method for that, nunique:
df.groupby('CITY')['ID_C'].nunique()
Out:
CITY
abc    2
def    3
Name: ID_C, dtype: int64
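For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the data frame from the question and applies nunique. The integer type of ID_C and the variable name `counts` are assumptions made for illustration; the question does not state the column dtypes.

```python
import pandas as pd

# Rebuild the example frame from the question (ID_C assumed to be integers).
df = pd.DataFrame({
    "CITY": ["abc", "abc", "abc", "def", "def", "def", "def"],
    "ID_C": [123, 123, 456, 123, 456, 789, 789],
})

# Count the distinct ID_C values within each CITY group.
# The result is a Series indexed by CITY and named ID_C.
counts = df.groupby("CITY")["ID_C"].nunique()
print(counts)
```

Note that nunique ignores missing values by default; pass dropna=False if NaN should count as its own value.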
To get the column name you asked for, convert the result to a DataFrame:
df.groupby('CITY')['ID_C'].nunique().to_frame('TOTAL_UNIQUE_COUNT')
Out:
      TOTAL_UNIQUE_COUNT
CITY
abc                    2
def                    3
To turn CITY back into a regular column, add reset_index():
df.groupby('CITY')['ID_C'].nunique().to_frame('TOTAL_UNIQUE_COUNT').reset_index()
Out:
  CITY  TOTAL_UNIQUE_COUNT
0  abc                   2
1  def                   3
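As for the error in the question: in older pandas versions the grouped value_counts() result is, as far as I can tell, a Series named ID_C whose index also contains a level named ID_C, so reset_index() tries to insert a column name that already exists. The sketch below shows two equivalent ways to get the final table in one step. The variable names are hypothetical, ID_C is assumed to hold integers, and the named-aggregation form assumes pandas 0.25 or later.

```python
import pandas as pd

# Example frame from the question (ID_C assumed to be integers).
df = pd.DataFrame({
    "CITY": ["abc", "abc", "abc", "def", "def", "def", "def"],
    "ID_C": [123, 123, 456, 123, 456, 789, 789],
})

# Option 1: name the value column while resetting the index,
# which avoids the intermediate to_frame() call.
via_reset = (
    df.groupby("CITY")["ID_C"]
      .nunique()
      .reset_index(name="TOTAL_UNIQUE_COUNT")
)

# Option 2: named aggregation, output_column=(input_column, function).
via_agg = (
    df.groupby("CITY")
      .agg(TOTAL_UNIQUE_COUNT=("ID_C", "nunique"))
      .reset_index()
)

print(via_reset)
print(via_agg)
```

Both produce a frame with the columns CITY and TOTAL_UNIQUE_COUNT; which one to use is mostly a matter of taste, though named aggregation extends more naturally if you later need several statistics per city.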
Upvotes: 2