How do I assign labels to values of variables?

Question

I have a basic dataset where one of the variables denotes a county in California. The variable is coded as an integer, with 1 being the first county alphabetically and 58 being the last.

For example:

ID      County  
1         1  
2         58  
3         5  
4         43  
5         2  
6         19  
7         42  
8         2  
9         1  
10        14

In Stata, I would do the following:

label define county_label 1 "Alameda" 2 "Alpine" 3 "Amador" 58 "Yuba"  
label val county county_label

Next, using the tabulate command, I get the output below:

ID      County  
1         Alameda  
2         Yuba  
3         Calaveras  
4         Santa Clara  
5         Alpine  
6         Los Angeles  
7         Santa Barbara  
8         Alpine  
9         Alameda  
10        Inyo

In Python, I have tried creating a dictionary as a first step:

county_dictionary = {1 : 'Alameda', 2 : 'Alpine', ......  58 : 'Yuba'}

However, after this I am completely lost; I am not even sure whether the dictionary is necessary.

How do I get the same output in Python?

foxpal · Accepted Answer

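Try this. It looks up each row's County code in your dictionary and falls back to 'Unknown' when a code has no entry. Note that it overwrites the numeric County column with the labels: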

df['County'] = df.apply(lambda x: county_dictionary.get(x['County'], 'Unknown'), axis=1)
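A column-wise lookup with `Series.map` is a common alternative to a row-wise `apply`. The sketch below is only an illustration under stated assumptions: the DataFrame is rebuilt from the sample rows in the question, the dictionary holds just the four codes from the Stata `label define` line (the full one would have all 58), and the `County_label` column name is made up so the original codes are kept.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "ID": list(range(1, 11)),
    "County": [1, 58, 5, 43, 2, 19, 42, 2, 1, 14],
})

# Partial mapping for illustration only: the four codes shown in the
# Stata example. A real dictionary would list all 58 counties.
county_dictionary = {1: "Alameda", 2: "Alpine", 3: "Amador", 58: "Yuba"}

# Series.map looks up every code in the dictionary; codes without an
# entry become NaN, which fillna then replaces with "Unknown".
# "County_label" is a hypothetical column name, used so that the
# numeric codes in "County" are not overwritten.
df["County_label"] = df["County"].map(county_dictionary).fillna("Unknown")

print(df)
```

With the partial dictionary above, codes such as 5 or 43 come out as "Unknown"; once all 58 entries are filled in, they would show the county names instead.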
