NaN error from .map on a column in a dataframe

Question

I have a dataframe with a column of state names spelled out, and I'm trying to convert them to their two-letter abbreviations. I found a separate CSV file with all the state names and converted it into a dictionary. I then tried to use that dictionary to map the column, but every value in the output column came back as NaN.

The original dataframe I had contains a column with city and state grouped together. I've split them into two separate columns and the state is the one that I'm playing around with.

Here's what my dataframe looks like after I've split them:

print(newtop50.head())
                    city_state     2018         city        state
11698       New York, New York  8398748     New York     New York
1443   Los Angeles, California  3990456  Los Angeles   California
3415         Chicago, Illinois  2705994      Chicago     Illinois
17040           Houston, Texas  2325502      Houston        Texas
665           Phoenix, Arizona  1660272      Phoenix      Arizona

Here are a few entries from my dictionary:

print(states_dic)
{'Alabama': 'AL', 'Alaska': 'AK', 'Arizona': 'AZ', 'Arkansas': 'AR', 'California': 'CA', 'Colorado': 'CO', 'Connecticut': 'CT', 'Delaware': 'DE', 'District of Columbia': 'DC', 'Florida': 'FL', 'Georgia': 'GA', 'Hawaii': 'HI', 'Idaho': 'ID'

Here's what I've tried:

newtop50['state'] = newtop50['state'].map(states_dic)

print(newtop50.head())
                    city_state     2018         city state
11698       New York, New York  8398748     New York   NaN
1443   Los Angeles, California  3990456  Los Angeles   NaN
3415         Chicago, Illinois  2705994      Chicago   NaN
17040           Houston, Texas  2325502      Houston   NaN
665           Phoenix, Arizona  1660272      Phoenix   NaN

Not quite sure what I'm missing here?

Vishnudev Krishnadas · Accepted Answer

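You mention that you split the city_state column into city and state. For map to work, each value in the column must exactly match a key in the dictionary; any value without a matching key becomes NaN. My guess is that the state values have leading or trailing whitespace left over from the split (for example ' New York' rather than 'New York').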
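One way to check this guess before changing anything is to print the raw values and list the ones that have no exact key in the dictionary. The sketch below uses a small made-up Series and dictionary as stand-ins for newtop50['state'] and states_dic; the data is hypothetical, not the asker's actual values.

```python
import pandas as pd

# Hypothetical stand-ins for newtop50['state'] and states_dic.
state = pd.Series([" New York", " California", "Texas"])
states_dic = {"New York": "NY", "California": "CA", "Texas": "TX"}

# Printing the values as a list shows the quotes, so stray spaces become visible.
print(state.tolist())

# Values with no exact matching key are the ones map() would turn into NaN.
unmatched = state[~state.isin(list(states_dic))].unique().tolist()
print(unmatched)
```

If the unmatched list contains values that look correct apart from surrounding spaces, whitespace is the likely cause.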

Try stripping the whitespace before mapping, and assign the result back to the column:

newtop50['state'] = newtop50['state'].str.strip().map(states_dic)
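A minimal reproduction, assuming the original split was done on "," alone (the question does not show the split code, so that is a guess). The two-row frame and the shortened dictionary are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature of the question's data.
newtop50 = pd.DataFrame({"city_state": ["New York, New York", "Phoenix, Arizona"]})
states_dic = {"New York": "NY", "Arizona": "AZ"}

# Assumed split: on "," only, which leaves a leading space on the state part.
parts = newtop50["city_state"].str.split(",", expand=True)
newtop50["city"] = parts[0]
newtop50["state"] = parts[1]

# Mapping the unstripped values finds no matching keys, so every result is NaN.
unstripped = newtop50["state"].map(states_dic)
print(unstripped)

# Stripping first makes the values match the dictionary keys.
newtop50["state"] = newtop50["state"].str.strip().map(states_dic)
print(newtop50)
```

If this matches how the column was split, splitting on ", " (comma plus space) in the first place would also avoid the stray whitespace.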
