Reputation: 1
I have a data frame with many column. One of the column is named 'attributes' and in it has a list of dictionary with keys and values. I want to extract each keys and it values to it own column. This is what the data frame look like
Upvotes: 0
Views: 764
Reputation: 2787
The following will add the dictionary keys as additional columns, keeping the attributes
column in the dataframe:
df = pd.concat([df, df["attributes"].apply(pd.Series)], axis=1)
For the nested dictionaries, trying this simple example worked for me (here the initial column of dictionaries is colC
, with the nested dictionaries in foo
):
import pandas as pd
df = pd.DataFrame(
{
'colA': {0: 7, 1: 2, 2: 5, 3: 3, 4: 5},
'colB': {0: 7, 1: 8, 2: 10, 3: 2, 4: 5},
'colC': {
0: {'foo': {"A": 5, "B": 6, "C": 9}, 'bar': 182, 'baz': 148},
1: {'bar': 103, 'baz': 155},
2: {'foo': 165, 'bar': 184, 'baz': 170},
3: {'foo': 121, 'bar': 151, 'baz': 187},
4: {'foo': 137, 'bar': 199, 'baz': 108},
},
}
)
df = pd.concat([df, df["colC"].apply(pd.Series)], axis=1)
# colA colB colC foo bar baz
#0 7 7 {'foo': {'A': 5, 'B': 6, 'C': 9}, 'bar': 182, 'baz': 148} {'A': 5, 'B': 6, 'C': 9} 182.0 148.0
#1 2 8 {'bar': 103, 'baz': 155} NaN 103.0 155.0
#2 5 10 {'foo': 165, 'bar': 184, 'baz': 170} 165 184.0 170.0
#3 3 2 {'foo': 121, 'bar': 151, 'baz': 187} 121 151.0 187.0
#4 5 5 {'foo': 137, 'bar': 199, 'baz': 108} 137 199.0 108.0
df = pd.concat([df, df["foo"].apply(pd.Series)], axis=1)
# colA colB colC foo bar baz 0 A B C
#0 7 7 {'foo': {'A': 5, 'B': 6, 'C': 9}, 'bar': 182, 'baz': 148} {'A': 5, 'B': 6, 'C': 9} 182.0 148.0 NaN 5.0 6.0 9.0
#1 2 8 {'bar': 103, 'baz': 155} NaN 103.0 155.0 NaN NaN NaN NaN
#2 5 10 {'foo': 165, 'bar': 184, 'baz': 170} 165 184.0 170.0 165.0 NaN NaN NaN
#3 3 2 {'foo': 121, 'bar': 151, 'baz': 187} 121 151.0 187.0 121.0 NaN NaN NaN
#4 5 5 {'foo': 137, 'bar': 199, 'baz': 108} 137 199.0 108.0 137.0 NaN NaN NaN
There is the column 0
which appears because of the "empty" rows, but this should not be a problem.
Upvotes: 1