Reputation: 1
I have a basic dataset where one of the variables denotes a county in California. It is stored as a numeric code, with 1 being the first county alphabetically and 58 being the last.
For example:
ID County
1 1
2 58
3 5
4 43
5 2
6 19
7 42
8 2
9 1
10 14
In Stata, I would do the following:
label define county_label 1 "Alameda" 2 "Alpine" 3 "Amador" 58 "Yuba"
label val county county_label
Next, using the tabulate command, I get the output below:
ID County
1 Alameda
2 Yuba
3 Calaveras
4 Santa Clara
5 Alpine
6 Los Angeles
7 Santa Barbara
8 Alpine
9 Alameda
10 Inyo
In Python, I have tried creating a dictionary as a first step:
county_dictionary = {1 : 'Alameda', 2 : 'Alpine', ...... 58 : 'Yuba'}
However, after this I am completely lost; I am not sure whether the dictionary is even necessary.
How do I get the same output in Python?
Upvotes: 0
Views: 1424
Reputation: 623
Try this:
df['County'] = df.apply(lambda x: county_dictionary.get(x['County'], 'Unknown'), axis=1)
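As an alternative to the row-wise apply, the same lookup can probably be done with Series.map, which works on the column directly. The sketch below is a minimal example under some assumptions: the DataFrame is rebuilt from the ten sample rows in the question, the dictionary only lists the codes that appear in the question's output (a real one would have all 58 entries), and the new column name County_name is hypothetical, chosen so the original codes are kept.

```python
import pandas as pd

# Sample data rebuilt from the question's example rows (assumption: the real
# DataFrame has the same two columns).
df = pd.DataFrame({
    "ID": range(1, 11),
    "County": [1, 58, 5, 43, 2, 19, 42, 2, 1, 14],
})

# Partial lookup table: only the codes shown in the question's output.
county_dictionary = {
    1: "Alameda", 2: "Alpine", 5: "Calaveras", 14: "Inyo",
    19: "Los Angeles", 42: "Santa Barbara", 43: "Santa Clara", 58: "Yuba",
}

# Series.map looks up each code in the dictionary; codes with no entry
# become NaN, which fillna replaces with a placeholder label.
# County_name is a hypothetical column name; assign to df["County"] instead
# to overwrite the codes, as the apply version does.
df["County_name"] = df["County"].map(county_dictionary).fillna("Unknown")

print(df)
```

Writing to a separate column keeps the numeric codes available, which is roughly what Stata value labels do; overwriting the column is fine if the codes are no longer needed.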
Upvotes: 1