Reputation: 2949
Here is the data frame. Some cells in the other column contain a dictionary. I want to convert the dictionary items to columns.
import numpy as np
import pandas as pd

dfx={'name':['Alex','Jin',np.nan,'Peter'],
     'age':[np.nan,10,12,13],
     'other':[{'school':'abc','subject':'xyz'},
              np.nan,
              {'school':'abc','subject':'xyz'},
              np.nan,]
     }
dfx=pd.DataFrame(dfx)
Output
name    age    other
Alex           {'school': 'abc', 'subject': 'xyz'}
Jin     10.0
        12.0   {'school': 'abc', 'subject': 'xyz'}
Peter   13.0
Here is the desired output
name    age    school   subject
Alex           abc      xyz
Jin     10.0
        12.0   abc      xyz
Peter   13.0
Upvotes: 0
Views: 1172
Reputation: 25239
Try this
df_final = dfx[['name','age']].assign(**pd.DataFrame(dfx.other.to_dict()).T)
Out[41]:
name age school subject
0 Alex NaN abc xyz
1 Jin 10.0 NaN NaN
2 NaN 12.0 abc xyz
3 Peter 13.0 NaN NaN
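For anyone wondering what the one-liner does, here is a rough step-by-step sketch of the same idea. The intermediate names (by_row, wide, expanded) are made up for illustration and are not part of the answer.

```python
import numpy as np
import pandas as pd

dfx = pd.DataFrame({
    'name': ['Alex', 'Jin', np.nan, 'Peter'],
    'age': [np.nan, 10, 12, 13],
    'other': [{'school': 'abc', 'subject': 'xyz'}, np.nan,
              {'school': 'abc', 'subject': 'xyz'}, np.nan],
})

# Step 1: map each row label to its cell, which is either a dict or NaN.
by_row = dfx['other'].to_dict()

# Step 2: build a frame with one column per row label and one row per
# dictionary key; a NaN cell fills its whole column with NaN.
wide = pd.DataFrame(by_row)

# Step 3: transpose so rows line up with dfx and keys become columns.
expanded = wide.T

# Step 4: a DataFrame unpacks with ** like a mapping of
# column name -> Series, so assign() adds one column per key.
df_final = dfx[['name', 'age']].assign(**expanded)
print(df_final)
```

Since assign aligns on the index, this relies on the transposed frame keeping the same row labels as dfx.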
Upvotes: 2
Reputation: 17804
You can apply pd.Series to the column with dictionaries:
dfx.drop(columns='other').join(dfx['other'].apply(pd.Series).drop(columns=0))
Output:
name age school subject
0 Alex NaN abc xyz
1 Jin 10.0 NaN NaN
2 NaN 12.0 abc xyz
3 Peter 13.0 NaN NaN
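In case the extra drop looks odd: as far as I can tell, each NaN cell is turned into a one-element Series labelled 0, so the expanded frame picks up a stray column named 0. A small sketch that makes the intermediate visible (the names expanded and result are just for illustration, and keyword arguments are used for drop since newer pandas no longer accepts the axis positionally):

```python
import numpy as np
import pandas as pd

dfx = pd.DataFrame({
    'name': ['Alex', 'Jin', np.nan, 'Peter'],
    'age': [np.nan, 10, 12, 13],
    'other': [{'school': 'abc', 'subject': 'xyz'}, np.nan,
              {'school': 'abc', 'subject': 'xyz'}, np.nan],
})

# Each dict becomes a Series indexed by its keys; each NaN becomes a
# one-element Series whose only label is 0.
expanded = dfx['other'].apply(pd.Series)
print(expanded.columns.tolist())  # includes the stray label 0

# Drop the stray column, then join back on the shared row index.
result = dfx.drop(columns='other').join(expanded.drop(columns=0))
print(result)
```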
Upvotes: 0
Reputation: 13407
You can use the .str.get accessor to actually index into the dictionaries in your columns. This also returns nan whenever the cell value is nan instead of a dictionary:
clean_df = (dfx
    .assign(
        school=lambda df: df["other"].str.get("school"),
        subject=lambda df: df["other"].str.get("subject"))
    .drop("other", axis=1))
print(clean_df)
name age school subject
0 Alex NaN abc xyz
1 Jin 10.0 NaN NaN
2 NaN 12.0 abc xyz
3 Peter 13.0 NaN NaN
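If the keys are not known up front, the same .str.get idea can be looped over whatever keys actually occur. This is only a sketch under that assumption; the keys list and the loop are my own additions, not part of the answer above.

```python
import numpy as np
import pandas as pd

dfx = pd.DataFrame({
    'name': ['Alex', 'Jin', np.nan, 'Peter'],
    'age': [np.nan, 10, 12, 13],
    'other': [{'school': 'abc', 'subject': 'xyz'}, np.nan,
              {'school': 'abc', 'subject': 'xyz'}, np.nan],
})

# Collect every key seen in a dict cell, keeping first-seen order.
keys = []
for cell in dfx['other']:
    if isinstance(cell, dict):
        for key in cell:
            if key not in keys:
                keys.append(key)

# One new column per key, looked up with .str.get as in the answer.
new_columns = {key: dfx['other'].str.get(key) for key in keys}
clean_df = dfx.assign(**new_columns).drop(columns='other')
print(clean_df)
```

With only two known keys the explicit version above is easier to read; the loop mainly helps when the dictionaries vary from row to row.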
Upvotes: 2
Reputation: 26676
Create a dictionary from dfx's index and the other column, pass it to pd.DataFrame, and transpose the result. That will give you a new dataframe. Join the resulting dataframe to the first two columns of dfx.
dfx.iloc[:,:-1].join(pd.DataFrame(dict(zip(dfx.index,dfx.other))).T).fillna('')
    name age school subject
0   Alex        abc     xyz
1    Jin  10
2         12    abc     xyz
3  Peter  13
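The steps described above can be sketched one at a time like this. The variable names are hypothetical and only there to show each intermediate.

```python
import numpy as np
import pandas as pd

dfx = pd.DataFrame({
    'name': ['Alex', 'Jin', np.nan, 'Peter'],
    'age': [np.nan, 10, 12, 13],
    'other': [{'school': 'abc', 'subject': 'xyz'}, np.nan,
              {'school': 'abc', 'subject': 'xyz'}, np.nan],
})

# Pair each index label with its cell in the 'other' column.
cells_by_index = dict(zip(dfx.index, dfx['other']))

# Build a frame from that dictionary and transpose it, so each
# dictionary key ends up as a column aligned with dfx's rows.
expanded = pd.DataFrame(cells_by_index).T

# Join onto every column except the last one ('other').
joined = dfx.iloc[:, :-1].join(expanded)

# Optional: replace NaN with empty strings for display.
result = joined.fillna('')
print(result)
```

Note that fillna('') leaves age holding both numbers and empty strings, so it is probably better to keep the NaN values (joined) if you still need to compute on that column.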
Upvotes: 1