Reputation: 83
I have table in the following format:
user_id total_spent purchase_date purchase_id aisle department item_cost
33 279.3 1/29/18 26 gn 1 25.7
33 279.3 1/29/18 26 aj 2 17
33 279.3 1/29/18 26 bs 2 31.7
33 279.3 1/29/18 26 jt 3 30
33 279.3 1/29/18 26 tj 3 8.1
33 279.3 1/29/18 26 ga 4 60.6
33 279.3 1/29/18 26 tw 4 35.9
33 279.3 1/29/18 26 aw 5 34.1
33 279.3 1/29/18 26 te 5 36.2
26 331.7 11/15/17 85 je 1 93.5
26 331.7 11/15/17 85 tg 2 21.1
26 331.7 11/15/17 85 ds 2 23.2
26 331.7 11/15/17 85 jw 3 48.8
26 331.7 11/15/17 85 df 3 10.2
26 331.7 11/15/17 85 lj 4 56.2
26 331.7 11/15/17 85 oa 4 7
26 331.7 11/15/17 85 aw 5 32.4
26 331.7 11/15/17 85 cm 5 39.3
And would like to convert it into each department value as a column name like department_1, department_2 etc. Same as aisle and item cost: dept_aisle_gn, dept_aisle_df etc followed by corresponding cost as columns.
In other words, in the end, I would like to have 2 rows for 2 user_ids and the rest of the information will be shown in the columns rather than rows.
Originally I pulled the data from redshift and I will manipulate it with python pandas. Unfortunately, pivot or crosstab is not working for redshift. I tried pandas pivot but seems not working for me.
Thank you.
Upvotes: 0
Views: 119
Reputation: 164623
pd.pivot_table
supports this.
pd.pivot_table(df,
index=['user_id', 'total_spent', 'purchase_date', 'purchase_id'],
columns=['aisle', 'department'],
values='item_cost',
aggfunc=np.sum,
fill_value=0)
# aisle aj aw bs cm df \
# department 2 5 2 5 3
# user_id total_spent purchase_date purchase_id
# 26 331.7 11/15/17 85 0 32.4 0.0 39.3 10.2
# 33 279.3 1/29/18 26 17 34.1 31.7 0.0 0.0
# aisle ds ga gn je jt \
# department 2 4 1 1 3
# user_id total_spent purchase_date purchase_id
# 26 331.7 11/15/17 85 23.2 0.0 0.0 93.5 0
# 33 279.3 1/29/18 26 0.0 60.6 25.7 0.0 30
# aisle jw lj oa te tg tj \
# department 3 4 4 5 2 3
# user_id total_spent purchase_date purchase_id
# 26 331.7 11/15/17 85 48.8 56.2 7 0.0 21.1 0.0
# 33 279.3 1/29/18 26 0.0 0.0 0 36.2 0.0 8.1
# aisle tw
# department 4
# user_id total_spent purchase_date purchase_id
# 26 331.7 11/15/17 85 0.0
# 33 279.3 1/29/18 26 35.9
Upvotes: 1