Reputation: 1
I'm fairly new to python and would appreciate if someone can guide me in the right direction.
I have a dataset that has unique trades in each row. I need to find all rows that match on certain conditions. Basically, find any offsetting trades that fit a certain condition. For example:
Find trades that have the same REF_RATE, RECEIVE is within a difference of 5, MATURITY_DATE is with 7 days of each other. I have attached the image of data.
Thank You.
Upvotes: 0
Views: 51
Reputation: 967
You can use groupby
to achieve this. As per you requirement specific to this ask Find trades that have the same REF_RATE, RECEIVE is within a difference of 5, MATURITY_DATE is with 7 days of each other
you can proceed like this.
#sample data created from the image of your dataset
>>> data = {'Maturity_Date':['2/01/2021','10/01/2021','10/01/2021','6/06/2021'],'Trade_id':['10484','12880','11798','19561'],'REF_RATE':['BBSW','BBSW','OIS','BBSW'],'Recive':[1.5,1.25,2,10]}
>>> df = pd.DataFrame(data)
>>> df
Maturity_Date Trade_id REF_RATE Recive
0 2/01/2021 10484 BBSW 1.50
1 10/01/2021 12880 BBSW 1.25
2 10/01/2021 11798 OIS 2.00
3 6/06/2021 19561 BBSW 10.00
#convert Maturity_Date to datetime format and sort REF_RATE by date if needed
>>> df['Maturity_Date'] = pd.to_datetime(df['Maturity_Date'], dayfirst=True)
>>> df['Maturity_Date'] = df.groupby('REF_RATE')['Maturity_Date'].apply(lambda x: x.sort_values()) #if needed
>>> df
Maturity_Date Trade_id REF_RATE Recive
0 2021-01-02 10484 BBSW 1.50
1 2021-01-10 12880 BBSW 1.25
2 2021-01-10 11798 OIS 2.00
3 2021-06-06 19561 BBSW 10.00
#groupby of REF_RATE and apply condition on date and receive column
>>> df['date_diff>7'] = df.groupby('REF_RATE')['Maturity_Date'].diff() / np.timedelta64(1, 'D') > 7
>>> df['rate_diff>5'] = df.groupby('REF_RATE')['Recive'].diff() > 5
>>> df
Maturity_Date Trade_id REF_RATE Recive date_diff>7 rate_diff>5
0 2021-01-02 10484 BBSW 1.50 False False
1 2021-01-10 12880 BBSW 1.25 True False #date_diff true as for BBSW Maturity date is more than 7
2 2021-01-10 11798 OIS 2.00 False False
3 2021-06-06 19561 BBSW 10.00 True True #rate_diff and date_diff true because date>7 and receive difference>5
Upvotes: 1