Reputation: 61
I have a function which imports the data for one government bond curve at one date from a CSV file containing multiple government bonds with different ranges of maturities:
    def importdata(fileloc, date, name):
        'Imports data from a given location, date and name'
        data = pd.read_csv(fileloc)  # file location
        date = date
        result = data[data['date'] == date]  # getting the date of the bond
        data = result.loc[:, result.columns.str.startswith(name)]  # getting the curve wanted at the date
        data = data.T  # transposing the data
        data = data.reset_index()
        data.columns = ['maturity', 'spot rate']  # renaming columns
        data['maturity'] = data.maturity.str.rsplit(n=1).str[-1]
        return data
Example of data:

       maturity  spot rate
    0        1Y      0.081
    1       18M      0.164
    2        2Y      0.230
    3        3Y      0.361
    4        4Y      0.479
    5        5Y      0.577
    6        6Y      0.660
    7        7Y      0.732
    8        8Y      0.796
    9        9Y      0.851
    10      10Y      0.900
    11      12Y      0.967
    12      15Y      1.026
    13      20Y      1.044
    14      25Y      1.042
    15      30Y      1.020
I have added a line of code that keeps only the rows of the DataFrame up to a maximum maturity, which I pass as an input to the function:
data.iloc[:data.loc[data.maturity.str.contains(max_maturity,na=False)].index[0]]
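To make clear what that line does, here is a minimal sketch on a small made-up frame in the same shape as the example above (the values and the `max_maturity` value are illustrative only). `str.contains` builds a boolean mask, `.index[0]` takes the label of the first match, and `iloc[:n]` keeps the rows before that position, so the matching row itself is not included:

```python
import pandas as pd

# Made-up frame shaped like the example output above (illustrative values)
data = pd.DataFrame({
    "maturity": ["1Y", "18M", "2Y", "3Y", "4Y", "5Y"],
    "spot rate": [0.081, 0.164, 0.230, 0.361, 0.479, 0.577],
})
max_maturity = "3Y"  # illustrative input

# Label of the first row whose maturity contains the given string
first_match = data.loc[data.maturity.str.contains(max_maturity, na=False)].index[0]

# iloc slicing is end-exclusive, so the matching "3Y" row is not kept
truncated = data.iloc[:first_match]
print(truncated)
```

Because `str.contains` does substring matching, a value such as "5Y" also matches "15Y" and "25Y"; taking `.index[0]` relies on the rows being ordered by maturity.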
So now the function looks like this:
    def importdata(fileloc, date, name, max_maturity):
        'Imports data from a given location, date and name'
        data = pd.read_csv(fileloc)  # file location
        date = date
        result = data[data['date'] == date]  # getting the date of the curve
        data = result.loc[:, result.columns.str.startswith(name)]  # getting the curve wanted at the date
        data = data.T  # transposing the data
        data = data.reset_index()
        data.columns = ['maturity', 'spot rate']  # renaming columns
        data['maturity'] = data.maturity.str.rsplit(n=1).str[-1]
        data = data.iloc[:data.loc[data.maturity.str.contains(max_maturity, na=False)].index[0]]
        return data
The only problem is that now with that additional line of code, I can no longer import the full data. Is there a way I can alter the code to allow me to do so, whilst still being able to import only up to a specific maturity if I want?
Upvotes: 0
Views: 273
Reputation: 147
You could set the default of max_maturity to None and add an if statement:
    def importdata(fileloc, date, name, max_maturity=None):
        'Imports data from a given location, date and name'
        data = pd.read_csv(fileloc)  # file location
        # date = date  <- commented out: this assignment does nothing
        result = data[data['date'] == date]  # getting the date of the curve
        data = result.loc[:, result.columns.str.startswith(name)]  # getting the curve wanted at the date
        data = data.T  # transposing the data
        data = data.reset_index()
        data.columns = ['maturity', 'spot rate']  # renaming columns
        data['maturity'] = data.maturity.str.rsplit(n=1).str[-1]
        if max_maturity:  # only truncate when a maximum maturity was given
            data = data.iloc[:data.loc[data.maturity.str.contains(max_maturity, na=False)].index[0]]
        return data
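As a rough, self-contained sketch of how this behaves, the same function can be run against an in-memory CSV. The column names (`UK 1Y`, `DE 1Y`, ...), dates and rates below are made up for illustration; the only assumption taken from the question is that each column is named as the curve name followed by a space and the maturity:

```python
import io
import pandas as pd

# Hypothetical CSV contents; the real file layout is an assumption
CSV_TEXT = """date,UK 1Y,UK 2Y,UK 3Y,DE 1Y,DE 2Y
2020-01-01,0.10,0.20,0.30,0.00,0.05
2020-01-02,0.11,0.21,0.31,0.01,0.06
"""

def importdata(fileloc, date, name, max_maturity=None):
    'Imports data from a given location, date and name'
    data = pd.read_csv(fileloc)
    result = data[data['date'] == date]  # row for the requested date
    data = result.loc[:, result.columns.str.startswith(name)]  # columns of the requested curve
    data = data.T.reset_index()
    data.columns = ['maturity', 'spot rate']
    data['maturity'] = data.maturity.str.rsplit(n=1).str[-1]  # keep only the maturity label
    if max_maturity:  # skipped when max_maturity is left as None
        data = data.iloc[:data.loc[data.maturity.str.contains(max_maturity, na=False)].index[0]]
    return data

# Without max_maturity the full curve comes back
full = importdata(io.StringIO(CSV_TEXT), '2020-01-01', 'UK')

# With max_maturity the rows before the first match are returned
short = importdata(io.StringIO(CSV_TEXT), '2020-01-01', 'UK', max_maturity='3Y')

print(full)
print(short)
```

Note that the slice stops before the matching row, exactly as in the line from the question, so `max_maturity='3Y'` returns the 1Y and 2Y rows here. If the matching row should be included, adding 1 to the slice end would be one way to do it.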
Upvotes: 1