Reputation: 5
I have a dictionary whose keys are user IDs and whose values are lists of dictionaries. Take one key-value pair as an example:
my_dict['10020'] = [{'type': 'phone', 'count': 3},
{'type': 'id_card', 'count': 1},
{'type': 'email', 'count': 2}]
Now I would like to create a pandas DataFrame with one row per key-value pair, where the columns are the 'type' fields from the list of dictionaries above and the values are the corresponding 'count' fields, like this:
ID phone id_card email
10020 3 1 2
I don't know in advance how many distinct 'types' there are in the dictionary, so is there a handy way to do this without first traversing the dictionary to collect all the 'types'?
Upvotes: 0
Views: 436
Reputation: 403128
Consider some data d with variable types:
d = \
{
"10021": [
{
"type": "fax",
"count": 33
},
{
"type": "email",
"count": 22
}
],
"10020": [
{
"type": "phone",
"count": 3
},
{
"type": "id_card",
"count": 1
},
{
"type": "email",
"count": 2
}
]
}
Reshape your data as such:
r = [{'id' : k, 'counts' : d[k]} for k in d]
Now, use json_normalize + pivot:
import pandas as pd
df = pd.json_normalize(r, 'counts', 'id').pivot(index='id', columns='type', values='count')
df
type email fax id_card phone
id
10020 2.0 NaN 1.0 3.0
10021 22.0 33.0 NaN NaN
This should work for any type in your data.
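For reference, here is a self-contained sketch of the steps above. It assumes a recent pandas (1.0 or later), where json_normalize is exposed as pd.json_normalize and pivot is called with keyword arguments. The final fillna(0).astype(int) step is not part of the original answer; it assumes that a type missing for an ID should count as zero, which may or may not suit your data.

```python
import pandas as pd

d = {
    "10021": [{"type": "fax", "count": 33}, {"type": "email", "count": 22}],
    "10020": [
        {"type": "phone", "count": 3},
        {"type": "id_card", "count": 1},
        {"type": "email", "count": 2},
    ],
}

# One record per user ID, keeping the list of type/count dicts under 'counts'.
r = [{"id": k, "counts": v} for k, v in d.items()]

# Flatten to long format: one row per (type, count, id).
long_df = pd.json_normalize(r, record_path="counts", meta="id")

# Pivot to wide format: one row per id, one column per type.
df = long_df.pivot(index="id", columns="type", values="count")

# Assumption (not in the original answer): an absent type means a count of zero.
df_filled = df.fillna(0).astype(int)
print(df_filled)
```

Without the last step the counts stay as floats, because NaN forces the columns to a floating-point dtype.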
Upvotes: 1
Reputation: 323376
Data input
d={'10020': [{'type': 'phone', 'count': 3},
{'type': 'id_card', 'count': 1},
{'type': 'email', 'count': 2}],
'10021': [{'type': 'phone', 'count': 33},
{'type': 'id_card', 'count': 11},
{'type': 'email', 'count': 22}]
}
Then we use pd.concat:
pd.concat([pd.DataFrame(y).set_index('type').rename(columns={'count':x}).T for x,y in d.items()])
Out[480]:
type phone id_card email
10020 3 1 2
10021 33 11 22
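Since the question says the set of types is not known in advance, here is a small sketch of how the same pd.concat approach behaves when users do not share the same types. The data for user 10021 (including the 'fax' type) is made up for illustration, and the variable names frames and wide are my own. pd.concat aligns the frames on the union of their columns, so a type missing for a user should come out as NaN.

```python
import pandas as pd

# Illustration data: user 10021 has a different set of types than user 10020.
d = {
    "10020": [
        {"type": "phone", "count": 3},
        {"type": "id_card", "count": 1},
        {"type": "email", "count": 2},
    ],
    "10021": [{"type": "fax", "count": 33}, {"type": "email", "count": 22}],
}

# Build one single-row frame per user: index by type, name the count
# column after the user ID, then transpose so the user ID becomes the row.
frames = [
    pd.DataFrame(records).set_index("type").rename(columns={"count": user_id}).T
    for user_id, records in d.items()
]

# Stack the rows; columns are aligned on the union of all types seen.
wide = pd.concat(frames)
print(wide)
```

If zeros are preferred over NaN, wide.fillna(0) can be applied afterwards.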
Upvotes: 2