user4063739
user4063739

Reputation:

Is there a way to Ungroup group by data

I would like to ungroup the dataframe after applying group by condition. I used pandas group by & sum to group by. Following is group by table

+------------+--------------+----------------+-----------------+
|   Month    | Amt_8_red_17 | Amt_8_black_29 | Amt_10_black_29 |
+------------+--------------+----------------+-----------------+
| 01-01-2018 |            0 |             12 |              18 |
| 01-02-2018 |          100 |             12 |              12 |
+------------+--------------+----------------+-----------------+

Now I want to ungroup the data. The column names are the combination of columns joined by underscore. eg: Amt_8_red_17 : column names are (Amount,Count,Color,Id) The output looks like

+------------+---------+-------+-------+----+
|   Month    | Amount  | Count | Color | Id |
+------------+---------+-------+-------+----+
| 01-01-2018 |       0 |     8 | red   | 17 |
| 01-01-2018 |      12 |     8 | black | 29 |
| 01-01-2018 |      18 |    10 | black | 29 |
| 01-02-2018 |     100 |     8 | red   | 17 |
| 01-02-2018 |      12 |     8 | black | 29 |
| 01-02-2018 |      12 |    10 | black | 29 |
+------------+---------+-------+-------+----+

Is there any faster way?

Upvotes: 1

Views: 5425

Answers (1)

jezrael
jezrael

Reputation: 862441

First create index by all columns without _ by DataFrame.set_index if necessary.

Then create MultiIndex by str.split and reshape by DataFrame.unstack.

Last data cleaning - remove first level by first DataFrame.reset_index, second is for columns from MultiIndex. Last set new columns names.

df = df.set_index('Month')
df.columns = df.columns.str.split('_', expand=True)
df = df.unstack().reset_index(level=0, drop=True).reset_index()
df.columns= ['Count','Color','Id','Month','Amount']
print (df)
  Count  Color  Id       Month  Amount
0     8    red  17  01-01-2018       0
1     8    red  17  01-02-2018     100
2     8  black  29  01-01-2018      12
3     8  black  29  01-02-2018      12
4    10  black  29  01-01-2018      18
5    10  black  29  01-02-2018      12

Last if necessary change order columns:

df = df[['Month','Amount','Count','Color','Id']]
print (df)
        Month  Amount Count  Color  Id
0  01-01-2018       0     8    red  17
1  01-02-2018     100     8    red  17
2  01-01-2018      12     8  black  29
3  01-02-2018      12     8  black  29
4  01-01-2018      18    10  black  29
5  01-02-2018      12    10  black  29

Upvotes: 4

Related Questions