Jojo
Jojo

Reputation: 1117

Re-shaping Dataframe so that Column Headers are made into Rows

I am trying to reshape the dataframe below.

          Tenor   2013M06D12   2013M06D13   2013M06D14   \
1         1           1.24         1.26         1.23   
4         2           2.01         0.43         0.45   
5         3           1.21         2.24         1.03   
8         4           0.39         2.32         1.23  

So, that it looks as follows. I was looking at using pivot_table, but this is sort of the opposite of what that would do as I need to convert Column Headers to rows and not the other way around. Hence, I am not sure how to proceed in order to obtain this dataframe.

           Date        Tenor           Rate
1    2013-06-12           1            1.24        
2    2013-06-13           1            1.26          
4    2013-06-14           1            1.23            

The code just involves reading from a CSV:

result = pd.DataFrame.read_csv("BankofEngland.csv")

Upvotes: 2

Views: 99

Answers (3)

G.G
G.G

Reputation: 765

df1.set_index("Tenor").stack().reset_index().add_prefix("col_").sql().select("strptime(col_level_1,'%YM%mD%d')::date Date,col_Tenor Tenor,col_0 Rate").to_df()

Out[104]: 
         Date  Tenor  Rate
0  2013-06-12      1  1.24
1  2013-06-13      1  1.26
2  2013-06-14      1  1.23
3  2013-06-12      2  2.01
4  2013-06-13      2  0.43
5  2013-06-14      2  0.45
6  2013-06-12      3  1.21
7  2013-06-13      3  2.24
8  2013-06-14      3  1.03
9  2013-06-12      4  0.39
10 2013-06-13      4  2.32
11 2013-06-14      4  1.23

Upvotes: 0

Jianxun Li
Jianxun Li

Reputation: 24742

import pandas as pd
import numpy as np


# try to read your sample data, replace with your read_csv func
df = pd.read_clipboard()

Out[139]: 
   Tenor  2013M06D12  2013M06D13  2013M06D14
1      1        1.24        1.26        1.23
4      2        2.01        0.43        0.45
5      3        1.21        2.24        1.03
8      4        0.39        2.32        1.23


# reshaping
df.set_index('Tenor', inplace=True)
df = df.stack().reset_index()
df.columns=['Tenor', 'Date', 'Rate']
# suggested by DSM, use the date parser
df.Date = pd.to_datetime(df.Date, format='%YM%mD%d')

Out[147]: 
    Tenor       Date  Rate
0       1 2013-06-12  1.24
1       1 2013-06-13  1.26
2       1 2013-06-14  1.23
3       2 2013-06-12  2.01
4       2 2013-06-13  0.43
..    ...        ...   ...
7       3 2013-06-13  2.24
8       3 2013-06-14  1.03
9       4 2013-06-12  0.39
10      4 2013-06-13  2.32
11      4 2013-06-14  1.23

[12 rows x 3 columns]

Upvotes: 1

DSM
DSM

Reputation: 353099

I think you can do with with a melt, a sort, a date parse, and some column shuffling:

dfm = pd.melt(df, id_vars="Tenor", var_name="Date", value_name="Rate")
dfm = dfm.sort("Tenor").reset_index(drop=True)
dfm["Date"] = pd.to_datetime(dfm["Date"], format="%YM%mD%d")
dfm = dfm[["Date", "Tenor", "Rate"]]

produces

In [104]: dfm
Out[104]: 
         Date  Tenor  Rate
0  2013-06-12      1  1.24
1  2013-06-13      1  1.26
2  2013-06-14      1  1.23
3  2013-06-12      2  2.01
4  2013-06-13      2  0.43
5  2013-06-14      2  0.45
6  2013-06-12      3  1.21
7  2013-06-13      3  2.24
8  2013-06-14      3  1.03
9  2013-06-12      4  0.39
10 2013-06-13      4  2.32
11 2013-06-14      4  1.23

Upvotes: 1

Related Questions