Reputation: 1112
I have the following dataset:
import pandas as pd

data = {'Country': ['UK', 'Ireland', 'Ireland', 'South Africa', 'Botswana', 'Italy', 'Greece'],
        'Sub_ISO': ['Europe', 'Europe', 'Europe', 'Southern Africa', 'Southern Africa', 'Europe', 'Europe'],
        'Language': ['EN', 'EN', 'IR', 'EN', 'EN', 'ITA', 'GRE'],
        'count': [170, 170, 170, 65, 64, 53, 150]}
df = pd.DataFrame(data=data)
I would like to assign a unique id number to each group of countries that are in the same Sub_ISO and speak the same language, so that every row in such a group gets the same id. I am not sure how to go about it, so I can't provide much more code.
Expected Output
EDIT: Ireland and any other country with more than one language is repeated, with one row per language.
Upvotes: 0
Views: 34
Reputation: 1112
This one seems to work!
df['new_id'] = df.groupby(['Sub_ISO', 'Language']).ngroup()
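For anyone who wants to try this against the sample data from the question, here is a minimal sketch. It assumes the column names from the sample dataset (Sub_ISO and Language); swap in your own column names if your real frame uses different ones. The name new_id is just an illustrative choice.

```python
import pandas as pd

# Sample data copied from the question.
data = {'Country': ['UK', 'Ireland', 'Ireland', 'South Africa', 'Botswana', 'Italy', 'Greece'],
        'Sub_ISO': ['Europe', 'Europe', 'Europe', 'Southern Africa', 'Southern Africa', 'Europe', 'Europe'],
        'Language': ['EN', 'EN', 'IR', 'EN', 'EN', 'ITA', 'GRE'],
        'count': [170, 170, 170, 65, 64, 53, 150]}
df = pd.DataFrame(data=data)

# ngroup() numbers each (Sub_ISO, Language) group; every row in the
# same group receives the same integer. With the default sort=True,
# groups are numbered in sorted key order.
df['new_id'] = df.groupby(['Sub_ISO', 'Language']).ngroup()

print(df[['Country', 'Sub_ISO', 'Language', 'new_id']])
```

Here UK and the English-speaking Ireland row share an id, South Africa and Botswana share another, and the Irish-speaking Ireland row gets its own. The ids follow the sorted order of the group keys, so they are not consecutive down the rows; if you would rather have them numbered by first appearance, passing sort=False to groupby should do that.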
Upvotes: 0