Reputation: 81
I have a dataframe in which one column I have a list of dictionaries, and in this dictionaries I has the name of the of the columns I want to create and its value.
id stats opta_id
0 1307 [{name: 'speed', value: 5},{name: 'strength', value: 10}....] p176278
1 2410 [{name: 'vision', value: 5}, {name: 'strength', value: 10}....] p118335
2 200 [{name: 'speed', value: 5},{name: 'vision', value: 10}....] p92187
3 3314 [{name: 'speed', value: 5},{name: 'strength', value: 10}....] p154976
4 9223 [{name: 'speed', value: 5},{name: 'strength', value: 10}....] p446990
the list can have up to 80 elements and the length of it is different on each row.
How could flat this column in order to get something similar to this?
id stats.speed stats.strength stats.vision ..... opta_id
0 1307 5 10 nan ..... p176278
1 2410 nan 5 10 ..... p118335
.
.
.
thank you!
Upvotes: 0
Views: 87
Reputation: 81
At the end I have found a solution to my problem. First I created at temporal dataframe with every row of the column
tmp = pd.concat([pd.DataFrame(x) for x in df['stats']], keys=df.index).reset_index(level=1, drop=True)
afterwards, I do pivot_table with the 'name' column and using as value the stat
pivot = pd.pivot_table(tmp, columns='name', index=df_stats.index.values, values='stat')
Upvotes: 0
Reputation: 148870
Here I would first build a temporary dataframe from a list of dict created from the stats column, and then concat it with the remaining columns:
tmp = pd.DataFrame([{d['name']: d['value'] for d in row}
for row in df['stats']]).rename(
columns=lambda x: 'stats.' + x)
df = pd.concat([df['id'], tmp, df['opta_id']], axis=1)
With the shown data, it gives:
id stats.speed stats.strength stats.vision opta_id
0 1307 5.0 10.0 NaN p176278
1 2410 NaN 10.0 5.0 p118335
2 200 5.0 NaN 10.0 p92187
3 3314 5.0 10.0 NaN p154976
4 9223 5.0 10.0 NaN p446990
Upvotes: 1