Boris
Boris

Reputation: 2521

Transposing group of data in pandas dataframe

I have a large dataframe like this:

|type| qt  | vol|
|----|---- | -- |
| A  | 1   | 10 |
| A  | 2   | 12 |
| A  | 1   | 12 |
| B  | 3   | 11 |
| B  | 4   | 20 |
| B  | 4   | 20 |
| C  | 4   | 20 |
| C  | 4   | 20 |
| C  | 4   | 20 |
| C  | 4   | 20 |

How can I transpose to the dataframe with grouping horizontally like that?

|A.            |B.            |C.            |
|--------------|--------------|--------------|
|type| qt | vol|type| qt | vol|type| qt | vol|
|----|----| ---|----|----| ---|----|----| ---|
| A  | 1  | 10 | B  | 3  | 11 | C  | 4  | 20 |
| A  | 2  | 12 | B  | 4  | 20 | C  | 4  | 20 |
| A  | 1  | 12 | B  | 4  | 20 | C  | 4  | 20 |
                              | C  | 4  | 20 |

Upvotes: 3

Views: 169

Answers (2)

Quang Hoang
Quang Hoang

Reputation: 150735

This is pretty much pivot by one column:

(df.assign(idx=df.groupby('type').cumcount())
   .pivot(index='idx',columns='type', values=df.columns)
   .swaplevel(0,1, axis=1)
   .sort_index(axis=1)
)

Output:

type    A              B            C         
       qt type  vol   qt type  vol qt type vol
idx                                           
0       1    A   10    3    B   11  4    C  20
1       2    A   12    4    B   20  4    C  20
2       1    A   12    4    B   20  4    C  20
3     NaN  NaN  NaN  NaN  NaN  NaN  4    C  20

Upvotes: 2

Shubham Sharma
Shubham Sharma

Reputation: 71689

You can group the dataframe on type then create key-value pairs of groups inside a dict comprehension, finally use concat along axis=1 and pass the optional keys parameter to get the final result:

d = {k:g.reset_index(drop=True) for k, g in df.groupby('type')}
pd.concat(d.values(), keys=d.keys(), axis=1)

Alternatively you can use groupby + cumcount to create a sequential counter per group, then create a multilevel index having two levels where the first level is counter and second level is column type itself, finally use stack followed by unstack to reshape:

c = df.groupby('type').cumcount()
df.set_index([c, df['type'].values]).stack().unstack([1, 2])

     A              B              C       
  type   qt  vol type   qt  vol type qt vol
0    A    1   10    B    3   11    C  4  20
1    A    2   12    B    4   20    C  4  20
2    A    1   12    B    4   20    C  4  20
3  NaN  NaN  NaN  NaN  NaN  NaN    C  4  20

Upvotes: 6

Related Questions