TheCluelessCoder
TheCluelessCoder

Reputation: 79

Pandas: How to pivot rows into columns

I am looking for a way to pivot around 600 columns into rows. Here's a sample with only 4 of those columns (good, bad, ok, Horrible):

df:

RecordID   good   bad   ok  Horrible
  A          0      0    1     0 
  B          1      0    0     1

desired output:

 RecordID     Column    Value
   A           Good       0
   A            Bad       0
   A            Ok        1
   A          Horrible    0
   B           Good       1
   B            Bad       0
   B            Ok        0
   B          Horrible    1

Upvotes: 0

Views: 205

Answers (3)

Piotr Żak
Piotr Żak

Reputation: 2132

Adding dataframe:

import pandas as pd
import numpy as np

data2 = {'RecordID': ['a', 'b', 'c'],
        'good': [0, 1, 1],
        'bad': [0, 0, 1],
        'horrible': [0, 1, 1],
        'ok': [1, 0, 0]}
  
# Convert the dictionary into DataFrame
df = pd.DataFrame(data2)

Melt data: https://pandas.pydata.org/docs/reference/api/pandas.melt.html

melted = df.melt(id_vars='RecordID', var_name='Column', value_name='Value')
melted

enter image description here

Optionally: Group By - for summ or mean values:

f2 = melted.groupby(['Column']).sum()
df2

enter image description here

enter image description here

Upvotes: 1

SeaBean
SeaBean

Reputation: 23217

You can use .stack() as follows. Using .stack() is preferred as it naturally resulted in rows already sorted in the order of RecordID so that you don't need to waste processing time sorting on it again, especially important when you have a large number of columns.

df = df.set_index('RecordID').stack().reset_index().rename(columns={'level_1': 'Column', 0: 'Value'})

Output:

  RecordID    Column  Value
0        A      good      0
1        A       bad      0
2        A        ok      1
3        A  Horrible      0
4        B      good      1
5        B       bad      0
6        B        ok      0
7        B  Horrible      1

Upvotes: 1

Arkadiusz
Arkadiusz

Reputation: 1875

You can use melt function:

(df.melt(id_vars='RecordID', var_name='Column', value_name='Value')
   .sort_values('RecordID')
   .reset_index(drop=True)
)

Output:

    RecordID      Column    Value
0          A        good        0
1          A         bad        0
2          A          ok        1
3          A    Horrible        0
4          B        good        1
5          B         bad        0
6          B          ok        0
7          B    Horrible        1

Upvotes: 1

Related Questions