Unpack list of dictionary into separate columns in Pandas

Question

Let's assume I have data which is structured like such:

{ 
    "_id" : 245, 
    "connId" : "3r34b32", 
    "roomList" : [
        {
            "reportId" : 29, 
            "siteId" : 1
        }]
}

How do I go about getting the keys from the dictionary within the list, along with the value in order for it to have this expected output below.

Expected Output:

ID,connID,reportId,siteId

245,3r34b32,29,1

I'm using json_normalize but this does not work with lists, so I need to add an extra bit of code which can unpack the list that contains a dictionary.

Any help will be appreciated.

Akshay Sehgal · Accepted Answer

You have a nested record. You can handle them separately with record_path and them concatenate them with pd.concat()

root = pd.json_normalize(d).drop('roomList',1)
nested = pd.json_normalize(d, record_path='roomList')
output = pd.concat([root, nested],axis=1)
print(output)

   _id   connId  reportId  siteId
0  245  3r34b32        29       1

Method without explicit column names

If you don't want to specify a column name explicitly you can do the following. This checks which of the columns after normalization are nested, separately normalizes them and concatenates all of them. I have added another nested JSON for sake of example -

root = pd.json_normalize(d)
nested_cols = [i for i in root.columns if isinstance(root[i][0], list)]

l = [root.drop(nested_cols,1),]
for i in nested_cols:
    l.append(pd.json_normalize(d, record_path=i))

output = pd.concat(l, axis=1)
print(output)

   _id   connId  reportId  siteId  reportId2  siteId2
0  245  3r34b32        29       1         39        5

NOTE: If some of your nested records have same names, you many want to use some prefix while normalizing them. Check this for more details.

Unpack list of dictionary into separate columns in Pandas

Answers (1)

Method without explicit column names

Related Questions