KieranPC
KieranPC

Reputation: 9095

How to go from relative dates to absolute dates in DataFrame columns

I have a pandas DataFrame containing forward prices for future maturities, quoted on multiple different trading months ('trade date'). Trade dates are given in absolute terms ('January'). The maturities are given in relative terms ('M+1').

How can I convert the maturities into an absolute format, i.e. in trade date 'January' the maturity 'M+1' should say 'February'.

Here is example data:

import pandas as pd
import numpy as np

data_keys = ['trade date', 'm+1', 'm+2', 'm+3']
data = {'trade date':['jan','feb','mar','apr'],
        'm+1':np.random.randn(4),
        'm+2':np.random.randn(4),
        'm+3':np.random.randn(4)}

df = pd.DataFrame(data)
df = df[data_keys]

Starting data:

  trade date       m+1       m+2       m+3
0        jan -0.446535 -1.012870 -0.839881
1        feb  0.013255  0.265500  1.130098
2        mar  0.406562 -1.122270 -1.851551
3        apr -0.890004  0.752648  0.778100

Result:

Should have Feb, Mar, Apr, May, Jun, Jul in the columns. NaN will be shown in many instances.

Upvotes: 1

Views: 341

Answers (2)

Alexander
Alexander

Reputation: 109546

You question doesn't contain enough information to answer it.

You say that the prices are quoted on dates given in absolute terms ('January').

January is not a date, but 2-Jan-2015 is.

What is your actual 'date' and what is its format (i.e. text, datetime.date, pd.Timestamp, etc.). You can use type(date) to check where date is an instance of whatever quote date it represents.

The easiest solution is to get your trade dates into pd.Timestamps and then add an offset:

trade_date = pd.Timestamp('2015-1-15')

>>> trade_date + pd.DateOffset(months=1)
Timestamp('2015-02-15 00:00:00')

Upvotes: 0

KieranPC
KieranPC

Reputation: 9095

The starting DataFrame:

  trade date       m+1       m+2       m+3
0        jan -1.350746  0.948835  0.579352
1        feb  0.011813  2.020158 -1.221110
2        mar -0.183187 -0.303099  1.323092
3        apr  0.081105  0.662628 -0.703152

Solution:

  • Define a list of all possible absolute dates you will encounter, in chronological order. Do the same for relative dates.
  • Create a function to act on groups coming from df.groupby. The function will convert the column names of each group appropriately to an absolute format.
  • Apply the function.
  • Pandas handles the clever concatenation of all groups.

Code:

abs_in_order = ['jan','feb','mar','apr','may','jun','jul','aug']
rel_in_order = ['m+0','m+1','m+2','m+3','m+4']

def rel2abs(group, abs_in_order, rel_in_order):
    abs_date = group['trade date'].unique()[0]    
    l = len(rel_in_order)
    i = abs_in_order.index(abs_date)
    namesmap = dict(zip(rel_in_order, abs_in_order[i:i+l]))
    group.rename(columns=namesmap, inplace=True)
    return group

grouped = df.groupby(['trade date'])
df = grouped.apply(rel2abs, abs_in_order, rel_in_order)

Pandas may mess up the column order. Do this to get back to something in chronological order:

order = ['trade date'] + abs_in_order
cols = [e for e in order if e in df.columns]
df[cols]

Result:

  trade date       feb       mar       apr       may       jun       jul
0        jan -1.350746  0.948835  0.579352       NaN       NaN       NaN
1        feb       NaN  0.011813  2.020158 -1.221110       NaN       NaN
2        mar       NaN       NaN -0.183187 -0.303099  1.323092       NaN
3        apr       NaN       NaN       NaN  0.081105  0.662628 -0.703152

Upvotes: 1

Related Questions