Reputation: 720
I have a dataframe like:
ID Sim Items
1 0.345 [7,7]
2 0.604 [2,7,3,8,5]
3 0.082 [9,1,9,1]
I want to form a pivot_table
by:
df.pivot_table(index ="ID" , columns = "Items", values="Sim")
To do that,
I have to extract list elements in items
column and repeat the ID
,Sim
values for each unique elements in row_list.
To be as:
ID Sim Items
1 0.345 7
2 0.604 2
2 0.604 7
2 0.604 3
2 0.604 8
2 0.604 5
3 0.082 9
3 0.082 1
pivot table :
7 2 3 8 5 1 9
1 0.345 - - - - - -
2 0.604 0.604 0.604 0.604 0.604
3 - - - - - 0.082 0.082
Is there any pythonic approach for that? Or any suggestions?
Upvotes: 1
Views: 1405
Reputation: 75100
Use explode(new in pandas 0.25+) before pivot;
df.explode('Items').pivot_table(index ="ID" , columns = "Items", values="Sim")
Items 1 2 3 5 7 8 9
ID
1 NaN NaN NaN NaN 0.345 NaN NaN
2 NaN 0.604 0.604 0.604 0.604 0.604 NaN
3 0.082 NaN NaN NaN NaN NaN 0.082
for lower versions of pandas, you can try with:
(df.drop('Items',1).join(pd.DataFrame(df['Items'].tolist())
.stack(dropna=False).droplevel(1).rename('Items'))
.pivot_table(index ="ID" , columns = "Items", values="Sim"))
Items 1 2 3 5 7 8 9
ID
1 NaN NaN NaN NaN 0.345 NaN NaN
2 NaN 0.604 0.604 0.604 0.604 0.604 NaN
3 0.082 NaN NaN NaN NaN NaN 0.082
If exact ordering matters , use reindex with unique of Items after explode:
(df.explode('Items').pivot_table(index ="ID" , columns = "Items", values="Sim")
.reindex(df.explode('Items')['Items'].unique(),axis=1))
Items 7 2 3 8 5 9 1
ID
1 0.345 NaN NaN NaN NaN NaN NaN
2 0.604 0.604 0.604 0.604 0.604 NaN NaN
3 NaN NaN NaN NaN NaN 0.082 0.082
Upvotes: 2