Filippo Sebastio

Reputation: 1112

Assigning ID values to obs that share multiple characteristics

I have the following dataset:

import pandas as pd

data = {'Country': ['UK','Ireland', 'Ireland', 'South Africa','Botswana','Italy','Greece'], 
        'Sub_ISO': ['Europe', 'Europe', 'Europe', 'Southern Africa','Southern Africa','Europe', 'Europe'], 
        'Language': ['EN', 'EN', 'IR',  'EN', 'EN', 'ITA', 'GRE'], 
        'count': [170,170, 170, 65,64,53,150]}
df = pd.DataFrame(data=data)

I would like to assign a unique ID number to countries that are in the same Sub_ISO and speak the same language, so that rows sharing both values get the same ID. Sorry, I am not sure how to go about it, so I can't really provide much more code.

Expected Output

[Image: expected output]

EDIT:

Ireland and other countries that have more than one language appear in several rows, one row per language.

Upvotes: 0

Views: 34

Answers (1)

Filippo Sebastio

Reputation: 1112

This one seems to work!

# Column names match the sample data in the question
df['new_id'] = df.groupby(['Sub_ISO', 'Language']).ngroup()
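For anyone who wants to try this end to end, here is a minimal sketch run against the sample data from the question. It assumes the columns are named Sub_ISO and Language, as in that sample; the column name new_id_first_seen is only illustrative. With the default sort=True, ngroup() numbers the groups in sorted key order, so the IDs do not follow row order.

```python
import pandas as pd

data = {'Country': ['UK', 'Ireland', 'Ireland', 'South Africa', 'Botswana', 'Italy', 'Greece'],
        'Sub_ISO': ['Europe', 'Europe', 'Europe', 'Southern Africa', 'Southern Africa', 'Europe', 'Europe'],
        'Language': ['EN', 'EN', 'IR', 'EN', 'EN', 'ITA', 'GRE'],
        'count': [170, 170, 170, 65, 64, 53, 150]}
df = pd.DataFrame(data=data)

# One integer per (Sub_ISO, Language) pair; groups are numbered in sorted key order.
df['new_id'] = df.groupby(['Sub_ISO', 'Language']).ngroup()

# Optional variant: number the groups in order of first appearance instead.
df['new_id_first_seen'] = df.groupby(['Sub_ISO', 'Language'], sort=False).ngroup()

print(df)
# new_id            -> 0, 0, 2, 4, 4, 3, 1
# new_id_first_seen -> 0, 0, 1, 2, 2, 3, 4
```

Both Ireland rows get different IDs because they differ in Language, while UK and the English-language Ireland row share one. If the IDs should start at 1, adding 1 to the result should be enough.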

Upvotes: 0
