Reputation: 1117
I am trying to reshape the dataframe below.
Tenor 2013M06D12 2013M06D13 2013M06D14 \
1 1 1.24 1.26 1.23
4 2 2.01 0.43 0.45
5 3 1.21 2.24 1.03
8 4 0.39 2.32 1.23
So, that it looks as follows. I was looking at using pivot_table
, but this is sort of the opposite of what that would do as I need to convert Column Headers to rows and not the other way around. Hence, I am not sure how to proceed in order to obtain this dataframe.
Date Tenor Rate
1 2013-06-12 1 1.24
2 2013-06-13 1 1.26
4 2013-06-14 1 1.23
The code just involves reading from a CSV:
result = pd.DataFrame.read_csv("BankofEngland.csv")
Upvotes: 2
Views: 99
Reputation: 765
df1.set_index("Tenor").stack().reset_index().add_prefix("col_").sql().select("strptime(col_level_1,'%YM%mD%d')::date Date,col_Tenor Tenor,col_0 Rate").to_df()
Out[104]:
Date Tenor Rate
0 2013-06-12 1 1.24
1 2013-06-13 1 1.26
2 2013-06-14 1 1.23
3 2013-06-12 2 2.01
4 2013-06-13 2 0.43
5 2013-06-14 2 0.45
6 2013-06-12 3 1.21
7 2013-06-13 3 2.24
8 2013-06-14 3 1.03
9 2013-06-12 4 0.39
10 2013-06-13 4 2.32
11 2013-06-14 4 1.23
Upvotes: 0
Reputation: 24742
import pandas as pd
import numpy as np
# try to read your sample data, replace with your read_csv func
df = pd.read_clipboard()
Out[139]:
Tenor 2013M06D12 2013M06D13 2013M06D14
1 1 1.24 1.26 1.23
4 2 2.01 0.43 0.45
5 3 1.21 2.24 1.03
8 4 0.39 2.32 1.23
# reshaping
df.set_index('Tenor', inplace=True)
df = df.stack().reset_index()
df.columns=['Tenor', 'Date', 'Rate']
# suggested by DSM, use the date parser
df.Date = pd.to_datetime(df.Date, format='%YM%mD%d')
Out[147]:
Tenor Date Rate
0 1 2013-06-12 1.24
1 1 2013-06-13 1.26
2 1 2013-06-14 1.23
3 2 2013-06-12 2.01
4 2 2013-06-13 0.43
.. ... ... ...
7 3 2013-06-13 2.24
8 3 2013-06-14 1.03
9 4 2013-06-12 0.39
10 4 2013-06-13 2.32
11 4 2013-06-14 1.23
[12 rows x 3 columns]
Upvotes: 1
Reputation: 353099
I think you can do with with a melt, a sort, a date parse, and some column shuffling:
dfm = pd.melt(df, id_vars="Tenor", var_name="Date", value_name="Rate")
dfm = dfm.sort("Tenor").reset_index(drop=True)
dfm["Date"] = pd.to_datetime(dfm["Date"], format="%YM%mD%d")
dfm = dfm[["Date", "Tenor", "Rate"]]
produces
In [104]: dfm
Out[104]:
Date Tenor Rate
0 2013-06-12 1 1.24
1 2013-06-13 1 1.26
2 2013-06-14 1 1.23
3 2013-06-12 2 2.01
4 2013-06-13 2 0.43
5 2013-06-14 2 0.45
6 2013-06-12 3 1.21
7 2013-06-13 3 2.24
8 2013-06-14 3 1.03
9 2013-06-12 4 0.39
10 2013-06-13 4 2.32
11 2013-06-14 4 1.23
Upvotes: 1