Reputation: 1891
Assume I have a data with this structure below, how would one explode the column that contains a list then unpack the exploded column?
Source:
d = {
"_id" : "5f2",
"connId" : 128,
"hospitalList" : [
{
"hospitalId" : 29,
"boardId" : 1019,
"siteId" : 1
},
{
"hospitalId" : 3091,
"boardId" : 2163,
"siteId" : 382
},
{
"hospitalId" : 28,
"boardId" : 1017,
"siteId" : 5
}]
}
Code:
root = pd.json_normalize(d)
nested_cols = [i for i in root.columns if isinstance(root[i][0], list)]
l = [root.drop(nested_cols,1),]
for i in nested_cols:
l.append(pd.json_normalize(d, record_path=i))
output = pd.concat(l, axis=1)
print(output)
Actual Result:
_id connId hospitalId boardId siteId
0 5f2 128.0 29 1019 1
1 NaN NaN 3091 2163 382
2 NaN NaN 28 1017 5
Expected Result:
_id connId hospitalId boardId siteId
0 5f2 128.0 29 1019 1
1 5f2 128.0 3091 2163 382
2 5f2 128.0 28 1017 5
Upvotes: 0
Views: 124
Reputation: 1950
This outputs what you want.
root = pd.json_normalize(d)
nested_cols = [i for i in root.columns if isinstance(root[i][0], list)]
l = [root.drop(nested_cols,1),]
for i in nested_cols:
l.append(pd.json_normalize(d, record_path=i))
output = pd.concat(l, axis=1)
output.fillna(method='ffill', inplace=True)
Though, unfortunately, I can't know under which circumstances you're going to use the code, and/or if you'll have to make adjustments.
Upvotes: 1