Irina Rapoport

Reputation: 1692

How do I add a column to a DataFrame that is derived from another column?

Sorry for the newbie question, I am taking baby steps in Python. My DataFrame has a column address of dtype object. Each address value contains a country, like this: {... "city": "...", "state": "...", "country": "..."} . How do I add a column country that is derived from the column address?

Upvotes: 0

Views: 138

Answers (1)

ThePyGuy

Reputation: 18446

Without the data it's difficult to answer, but if the values are Python dicts, applying pd.Series to each value should work, since it expands every dict into one column per key:

df['address'].apply(pd.Series)

You will have to assign the result back to the original DataFrame. If the values are JSON strings rather than dicts, you may first want to convert them to dictionaries using json.loads.
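If the values do turn out to be JSON strings, the conversion step could look roughly like the sketch below. The sample frame and the name parsed are made up for illustration, and the real data may be shaped differently. Using dict.get here is one possible choice; it yields None for a row whose address has no country key instead of raising.

```python
import json

import pandas as pd

# Hypothetical sample data: each address is a JSON string, not a dict.
df = pd.DataFrame({
    "x": [1, 2],
    "address": [
        '{"city": "xyz", "state": "Massachusetts", "country": "US"}',
        '{"city": "ABC", "state": "LONDON", "country": "UK"}',
    ],
})

# Parse each JSON string into a Python dict.
parsed = df["address"].apply(json.loads)

# Look up the "country" key in each dict and assign it back as a new column.
df["country"] = parsed.map(lambda d: d.get("country"))

print(df[["x", "country"]])
```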

SAMPLE RUN:

>>> df
   x                                                     address
0  1  {'city': 'xyz', 'state': 'Massachusetts', 'country': 'US'}
1  2         {'city': 'ABC', 'state': 'LONDON', 'country': 'UK'}

>>> df.assign(country=df['address'].apply(pd.Series)['country'])
   x                                                     address country
0  1  {'city': 'xyz', 'state': 'Massachusetts', 'country': 'US'}      US
1  2         {'city': 'ABC', 'state': 'LONDON', 'country': 'UK'}      UK

Even better, look up the key directly with the Series.str accessor, which avoids building the intermediate DataFrame:

>>> df.assign(country=df['address'].str['country'])

   x                                                     address country
0  1  {'city': 'xyz', 'state': 'Massachusetts', 'country': 'US'}      US
1  2         {'city': 'ABC', 'state': 'LONDON', 'country': 'UK'}      UK
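For anyone who wants to paste this into a script rather than a REPL, here is a self-contained sketch of both variants. The frame is rebuilt from the values in the sample run, and the names expanded, via_expand and via_str are only illustrative. Note that assign returns a new DataFrame, so the result has to be stored somewhere.

```python
import pandas as pd

# Rebuild the frame from the sample run: each address is a Python dict.
df = pd.DataFrame({
    "x": [1, 2],
    "address": [
        {"city": "xyz", "state": "Massachusetts", "country": "US"},
        {"city": "ABC", "state": "LONDON", "country": "UK"},
    ],
})

# Variant 1: expand every dict into one column per key, then pick "country".
expanded = df["address"].apply(pd.Series)
via_expand = df.assign(country=expanded["country"])

# Variant 2: index each dict by key through the .str accessor.
via_str = df.assign(country=df["address"].str["country"])

print(via_expand[["x", "country"]])
print(via_str[["x", "country"]])
```

To keep the column on the original frame, df = df.assign(...) or df['country'] = ... both work.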

Upvotes: 1
