Pandas create dataframe from lists of dictionaries

Question

I have a dictionary whose keys are user IDs and whose values are lists of dictionaries. Here is one key-value pair as an example:

my_dict['10020'] = [{'type': 'phone', 'count': 3},
                    {'type': 'id_card', 'count': 1},
                    {'type': 'email', 'count': 2}]

I would like to create a pandas DataFrame with one row per key-value pair, where the columns are the 'type' values from the list of dictionaries above and the cell values are the corresponding 'count' fields, like:

    ID       phone    id_card    email
    10020        3          1        2

I don't know in advance how many distinct 'types' there are in the dictionary. Instead of traversing the dictionary first to collect all the 'types', is there a handy way to get the job done?

cs95 · Accepted Answer

Consider some data d where the set of types varies from one ID to the next:

d = {
    "10021": [
        {
            "type": "fax",
            "count": 33
        },
        {
            "type": "email",
            "count": 22
        }
    ],
    "10020": [
        {
            "type": "phone",
            "count": 3
        },
        {
            "type": "id_card",
            "count": 1
        },
        {
            "type": "email",
            "count": 2
        }
    ]
}

Reshape your data into a list of records, one per ID:

r = [{'id' : k, 'counts' : d[k]} for k in d]
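
As a quick check of what this step produces, the sketch below builds r from a cut-down sample (the sample values are illustrative, not the full d above) and prints the first record. Each user ID moves from being a dictionary key to an ordinary 'id' field sitting next to its list, so a later flattening step can carry the ID along with every list entry. It iterates with d.items(), which is equivalent to the d[k] lookup used above.

```python
from pprint import pprint

# Cut-down sample in the same shape as d (illustrative values only).
d = {
    "10021": [{"type": "fax", "count": 33}, {"type": "email", "count": 22}],
    "10020": [{"type": "phone", "count": 3}],
}

# One record per user: the key becomes an 'id' field, the list goes under 'counts'.
r = [{"id": k, "counts": v} for k, v in d.items()]

pprint(r[0])
```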

Now, use json_normalize + pivot:

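import pandas as pd

# pd.json_normalize is the top-level name since pandas 1.0
# (older releases expose it as pd.io.json.json_normalize).
# pivot takes keyword arguments in current pandas.
df = pd.json_normalize(r, 'counts', 'id').pivot(index='id', columns='type', values='count')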
df

type   email   fax  id_card  phone
id                                
10020    2.0   NaN      1.0    3.0
10021   22.0  33.0      NaN    NaN

This works for any set of types in your data. A type that is missing for a given ID shows up as NaN.
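
If you want the result closer to the layout in the question (an ID column and integer counts), one possible follow-up is sketched below. It splits the flatten and pivot steps so the intermediate long table is visible, then fills the gaps. Treating a missing type as a count of 0 is an assumption about your data, not something pandas infers; the names long_df and wide are just illustrative.

```python
import pandas as pd

d = {
    "10021": [{"type": "fax", "count": 33}, {"type": "email", "count": 22}],
    "10020": [
        {"type": "phone", "count": 3},
        {"type": "id_card", "count": 1},
        {"type": "email", "count": 2},
    ],
}
r = [{"id": k, "counts": v} for k, v in d.items()]

# Long form: one row per (type, count) entry, with the owning id attached.
long_df = pd.json_normalize(r, record_path="counts", meta="id")

# Wide form: one row per id, one column per type.
wide = (
    long_df.pivot(index="id", columns="type", values="count")
    .fillna(0)                      # assumption: a missing type means a count of 0
    .astype(int)                    # counts are whole numbers again once NaN is gone
    .rename_axis(columns=None)      # drop the leftover 'type' label on the column axis
    .reset_index()
    .rename(columns={"id": "ID"})
)
print(wide)
```

Skip the fillna and astype steps if you need to tell "type absent" apart from "count is zero".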
