How do I reshape this DataFrame in Python?

Question

I have a DataFrame df_sale in Python that I want to reshape, count the sum across the price column and add a new column total.

Here is the df_sale dataframe:

b_no  a_id  price  c_id
120   24     50     2
120   56     100    2
120   90     25     2
120   45     20     2
231   89     55     3
231   45     20     3
231   10     250    3

Excepted output after reshaping:

b_no  a_id_1  a_id_2  a_id_3  a_id_4  total  c_id
120   24      56      90      45      195    2
231   89      45      10      0       325    3

What I have tried so far is use the sum() on df_sale['price'] separately for 120 and 231. I do not understand how should I reshape the data, add new column headers and get the total without being computationally inefficient. Thanks.

sacuL · Accepted Answer

This might not be the cleanest method (at all), but it gets the outcome you want:

reshaped_df = (df.groupby('b_no')[['price', 'c_id']]
               .first()
               .join(df.groupby('b_no')['a_id']
                     .apply(list)
                     .apply(pd.Series)
                     .add_prefix('a_id_'))
               .drop('price',1)
               .join(df.groupby('b_no')['price'].sum().to_frame('total'))
               .fillna(0))


>>> reshaped_df
      c_id  a_id_0  a_id_1  a_id_2  a_id_3  total
b_no                                             
120      2    24.0    56.0    90.0    45.0    195
231      3    89.0    45.0    10.0     0.0    325

How do I reshape this DataFrame in Python?

Answers (2)

Related Questions