Ngoc Chau
Ngoc Chau

Reputation: 1

Create new column using keys pair value from a dataframe column

I have a data frame with many column. One of the column is named 'attributes' and in it has a list of dictionary with keys and values. I want to extract each keys and it values to it own column. This is what the data frame look like

Upvotes: 0

Views: 764

Answers (1)

Rawson
Rawson

Reputation: 2787

The following will add the dictionary keys as additional columns, keeping the attributes column in the dataframe:

df = pd.concat([df, df["attributes"].apply(pd.Series)], axis=1)

Edit

For the nested dictionaries, trying this simple example worked for me (here the initial column of dictionaries is colC, with the nested dictionaries in foo):

import pandas as pd

df = pd.DataFrame(
    {
        'colA': {0: 7, 1: 2, 2: 5, 3: 3, 4: 5},
        'colB': {0: 7, 1: 8, 2: 10, 3: 2, 4: 5},
        'colC': {
            0: {'foo': {"A": 5, "B": 6, "C": 9}, 'bar': 182, 'baz': 148},
            1: {'bar': 103, 'baz': 155},
            2: {'foo': 165, 'bar': 184, 'baz': 170},
            3: {'foo': 121, 'bar': 151, 'baz': 187},
            4: {'foo': 137, 'bar': 199, 'baz': 108},
        },
    }
)

df = pd.concat([df, df["colC"].apply(pd.Series)], axis=1)
#   colA  colB                                                       colC                       foo    bar    baz
#0     7     7  {'foo': {'A': 5, 'B': 6, 'C': 9}, 'bar': 182, 'baz': 148}  {'A': 5, 'B': 6, 'C': 9}  182.0  148.0
#1     2     8                                   {'bar': 103, 'baz': 155}                       NaN  103.0  155.0
#2     5    10                       {'foo': 165, 'bar': 184, 'baz': 170}                       165  184.0  170.0
#3     3     2                       {'foo': 121, 'bar': 151, 'baz': 187}                       121  151.0  187.0
#4     5     5                       {'foo': 137, 'bar': 199, 'baz': 108}                       137  199.0  108.0

df = pd.concat([df, df["foo"].apply(pd.Series)], axis=1)
#   colA  colB                                                       colC                       foo    bar    baz      0    A    B    C
#0     7     7  {'foo': {'A': 5, 'B': 6, 'C': 9}, 'bar': 182, 'baz': 148}  {'A': 5, 'B': 6, 'C': 9}  182.0  148.0    NaN  5.0  6.0  9.0
#1     2     8                                   {'bar': 103, 'baz': 155}                       NaN  103.0  155.0    NaN  NaN  NaN  NaN
#2     5    10                       {'foo': 165, 'bar': 184, 'baz': 170}                       165  184.0  170.0  165.0  NaN  NaN  NaN
#3     3     2                       {'foo': 121, 'bar': 151, 'baz': 187}                       121  151.0  187.0  121.0  NaN  NaN  NaN
#4     5     5                       {'foo': 137, 'bar': 199, 'baz': 108}                       137  199.0  108.0  137.0  NaN  NaN  NaN

There is the column 0 which appears because of the "empty" rows, but this should not be a problem.

Upvotes: 1

Related Questions