ohai
ohai

Reputation: 183

Transpose Pandas Dataframe Python

I have this following dataframe:

  Status    Percentage  Value   Name    Tahun
0   X       66.666667    4.0     A      2021
1   Y       33.333333    2.0     A      2021
2   Z       0.000000     0.0     A      2021
0   X       25.000000    2.0     A      2020
1   Y       62.500000    5.0     A      2020
2   Z       12.500000    1.0     A      2020           

I want to transpose the dataframe and change the column header to Status values. Ideally the output should look like

X            Y           Z          Type         Name    Tahun
66.666667    33.333333   0.000000   Percentage    A       2021 
4.0          2.0         0.0        Value         A       2021
25.000000    62.500000   12.500000  Percentage    A       2020
2.0          5.0         1.0        Value         A       2020                             
               

I tried this one:

df = df.set_index('Status').T

but I didnt get output as my expected. How can I change the rest of column names?

Upvotes: 1

Views: 1212

Answers (2)

U13-Forward
U13-Forward

Reputation: 71570

Or just use melt and pivot:

(df.melt(['Name', 'Tahun', 'Status'], var_name='Type')
   .pivot('value', ['Name', 'Tahun', 'Type'], 'Status')
   .reset_index()
   .rename_axis(columns=None))

  Name  Tahun        Type          X          Y     Z
0    A   2020  Percentage  25.000000  62.500000  12.5
1    A   2020       Value   2.000000   5.000000   1.0
2    A   2021  Percentage  66.666667  33.333333   0.0
3    A   2021       Value   4.000000   2.000000   0.0

This code melts the dataframe so that the Percentage and Value columns get merged and a new column Type get's created, then it pivots it so that the Status column values become columns.

If there are duplicates:

(df.melt(['Name', 'Tahun', 'Status'], var_name='Type')
   .pivot_table('value', ['Name', 'Tahun', 'Type'], 'Status')
   .reset_index()
   .rename_axis(columns=None))

Difference is that pivot_table has an aggfunc argument, default set to mean, so if there are duplicate values, it will find the average of the other values, whereas pivot doesn't have that argument.

Upvotes: 0

akuiper
akuiper

Reputation: 214957

stack (Percentage and Value) + unstack (Status):

(df.set_index(['Name', 'Tahun', 'Status'])
   .stack()
   .unstack(level='Status')
   .rename_axis(('Name', 'Tahun', 'Type'))
   .reset_index())

Status Name  Tahun        Type          X          Y     Z
0         A   2020  Percentage  25.000000  62.500000  12.5
1         A   2020       Value   2.000000   5.000000   1.0
2         A   2021  Percentage  66.666667  33.333333   0.0
3         A   2021       Value   4.000000   2.000000   0.0

Upvotes: 1

Related Questions