Lee

Reputation: 57

NaN error from .map on a column in a dataframe

I'm working with a dataframe that has a column of fully spelled-out state names, and I'm trying to convert them to their two-letter abbreviations. I found a separate CSV file with all the state names and converted it into a dictionary. I then tried to use that dictionary to map the column, but every value in the output column came back as NaN.

The original dataframe I had contains a column with city and state grouped together. I've split them into two separate columns and the state is the one that I'm playing around with.

Here's what my dataframe looks like after I've split them:

print(newtop50.head())
                    city_state     2018         city        state
11698       New York, New York  8398748     New York     New York
1443   Los Angeles, California  3990456  Los Angeles   California
3415         Chicago, Illinois  2705994      Chicago     Illinois
17040           Houston, Texas  2325502      Houston        Texas
665           Phoenix, Arizona  1660272      Phoenix      Arizona

This is what the first few entries of my dictionary look like:

print(states_dic)
{'Alabama': 'AL', 'Alaska': 'AK', 'Arizona': 'AZ', 'Arkansas': 'AR', 'California': 'CA', 'Colorado': 'CO', 'Connecticut': 'CT', 'Delaware': 'DE', 'District of Columbia': 'DC', 'Florida': 'FL', 'Georgia': 'GA', 'Hawaii': 'HI', 'Idaho': 'ID'

Here's what I've tried:

newtop50['state'] = newtop50['state'].map(states_dic)

print(newtop50.head())
                    city_state     2018         city state
11698       New York, New York  8398748     New York   NaN
1443   Los Angeles, California  3990456  Los Angeles   NaN
3415         Chicago, Illinois  2705994      Chicago   NaN
17040           Houston, Texas  2325502      Houston   NaN
665           Phoenix, Arizona  1660272      Phoenix   NaN

Not quite sure what I'm missing here?

Upvotes: 1

Views: 614

Answers (2)

anky

Reputation: 75100

In case you don't want to create the mapping manually (the dictionary in the example has missing values), you can use the us module:

import us
states_dic=us.states.mapping('name', 'abbr')

df.state.map(states_dic)

11698    NY
1443     CA
3415     IL
17040    TX
665      AZ
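
If you would rather keep your own dictionary, it may help to first list the values that have no matching key, since those are the ones map turns into NaN. A minimal sketch, assuming made-up sample data and a deliberately incomplete dictionary (states and partial_dic are hypothetical names, not from the question):

```python
import pandas as pd

# Hypothetical sample: one name missing from the dictionary,
# one name with a stray leading space.
states = pd.Series(["New York", "California", " Texas"])
partial_dic = {"California": "CA", "Texas": "TX"}

# Keep only the values that are not exact keys of the dictionary.
unmapped = states[~states.isin(list(partial_dic))]
print(unmapped.tolist())  # ['New York', ' Texas']
```

Printing the list rather than the Series shows the quotes, which makes stray whitespace visible.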

Upvotes: 1

Vishnudev Krishnadas

Reputation: 10960

You mentioned that you split the city_state column into city and state. For map to work, each value must match a dictionary key exactly. My guess is that the state values have leading or trailing whitespace, for example a space left over after the comma from the split.

Try doing

newtop50['state'] = newtop50['state'].str.strip().map(states_dic)
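
To see why stripping might matter, here is a small reproduction. It is only a sketch: it assumes the split was done on "," alone (the question does not show the split code), and the three-row dataframe and short dictionary are made up for illustration.

```python
import pandas as pd

# Hypothetical sample mirroring the shape of the question's data.
df = pd.DataFrame(
    {"city_state": ["New York, New York", "Houston, Texas", "Phoenix, Arizona"]}
)
states_dic = {"New York": "NY", "Texas": "TX", "Arizona": "AZ"}

# Assumption: splitting on "," alone, which keeps the space after the comma.
parts = df["city_state"].str.split(",", expand=True)
df["city"] = parts[0]
df["state"] = parts[1]

# Printing as a list shows the quotes, so the leading space is visible.
print(df["state"].tolist())  # [' New York', ' Texas', ' Arizona']

# ' Texas' is not the key 'Texas', so nothing matches and map yields NaN.
unstripped = df["state"].map(states_dic)

# Stripping whitespace first makes the values match the keys exactly.
stripped = df["state"].str.strip().map(states_dic)
print(stripped.tolist())  # ['NY', 'TX', 'AZ']
```

If that is the cause, splitting on ", " (comma plus space) in the first place would avoid the stray space as well.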

Upvotes: 1
