Reputation: 906
My DataFrame looks like this and is rather large:
contract time Open High Low Last
0 CME/TYH2018 2017-09-18 125.687500 125.750000 125.687500 125.750000
1 CME/TYH2018 2017-09-20 125.703125 125.750000 125.234375 125.375000
2 CME/TYH2018 2017-09-22 125.609375 125.609375 125.437500 125.484375
3 CME/TYH2018 2017-09-25 125.687500 125.812500 125.687500 125.765625
4 CME/TYH2018 2017-09-26 125.640625 125.796875 125.562500 125.625000
5 CME/TYH2018 2017-09-27 125.171875 125.218750 125.031250 125.125000
371 CME/TYZ2018 2018-07-12 119.984375 120.062500 119.859375 120.015625
372 CME/TYZ2018 2018-07-13 120.156250 120.234375 120.078125 120.218750
373 CME/TYZ2018 2018-07-16 120.000000 120.031250 119.859375 120.000000
374 CME/TYZ2018 2018-07-17 119.968750 120.046875 119.890625 119.953125
375 CME/TYZ2018 2018-07-18 119.875000 120.062500 119.843750 119.890625
I want to slice the data as follows: for every unique contract, keep only the rows from the last 100 days of that contract's data, where the start of the slice for each contract is:
df.loc[df.contract=='CME/TYH2018'].time.max() - datetime.timedelta(days=100)
All other rows should be discarded.
Upvotes: 2
Views: 41
Reputation: 862511
Use GroupBy.transform with max to get a Series the same size as the DataFrame, subtract a timedelta, and finally filter by boolean indexing:
shifted = df.groupby('contract')['time'].transform('max') - pd.Timedelta(100, unit='d')
df = df[df['time'] > shifted]
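To illustrate why transform is used here rather than a plain aggregation, the sketch below compares the two on a toy frame. The contract names and dates are made up for illustration, and it assumes the time column is already of datetime dtype.

```python
import pandas as pd

# Toy frame (hypothetical values) with two contracts.
df = pd.DataFrame({
    "contract": ["A", "A", "A", "B", "B"],
    "time": pd.to_datetime(
        ["2018-01-01", "2018-01-05", "2018-01-10", "2018-03-01", "2018-03-02"]
    ),
})

# Plain aggregation: one value per group, indexed by contract.
per_group = df.groupby("contract")["time"].max()

# transform: one value per row, sharing df's index, so it can be
# compared elementwise against df["time"] for boolean indexing.
per_row = df.groupby("contract")["time"].transform("max")

print(per_group)
print(per_row)
```

Because per_row is aligned with the original index, the comparison df['time'] > shifted is evaluated row by row against each row's own contract maximum.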
Test with the sample data, using 3 days:
shifted = df.groupby('contract')['time'].transform('max') - pd.Timedelta(3, unit='d')
df = df[df['time'] > shifted]
print(df)
contract time Open High Low Last
3 CME/TYH2018 2017-09-25 125.687500 125.812500 125.687500 125.765625
4 CME/TYH2018 2017-09-26 125.640625 125.796875 125.562500 125.625000
5 CME/TYH2018 2017-09-27 125.171875 125.218750 125.031250 125.125000
373 CME/TYZ2018 2018-07-16 120.000000 120.031250 119.859375 120.000000
374 CME/TYZ2018 2018-07-17 119.968750 120.046875 119.890625 119.953125
375 CME/TYZ2018 2018-07-18 119.875000 120.062500 119.843750 119.890625
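For reuse, the same idea can be wrapped in a small helper. This is only a sketch: the function name last_n_days and its parameters are hypothetical, the sample frame reproduces a subset of the columns from the question, and pd.to_datetime is applied on the assumption that the time column may still hold strings.

```python
import pandas as pd


def last_n_days(df, days, group_col="contract", time_col="time"):
    """Keep, per group, the rows newer than (group's latest time - days).

    Hypothetical helper; column names default to those in the question.
    """
    # Convert defensively in case the column holds strings.
    times = pd.to_datetime(df[time_col])
    # Per-row cutoff: each row gets its own group's max minus the window.
    cutoff = times.groupby(df[group_col]).transform("max") - pd.Timedelta(days=days)
    return df[times > cutoff]


# Sample rows from the question (only the Last price column is kept).
sample = pd.DataFrame(
    {
        "contract": ["CME/TYH2018"] * 6 + ["CME/TYZ2018"] * 5,
        "time": [
            "2017-09-18", "2017-09-20", "2017-09-22",
            "2017-09-25", "2017-09-26", "2017-09-27",
            "2018-07-12", "2018-07-13", "2018-07-16",
            "2018-07-17", "2018-07-18",
        ],
        "Last": [
            125.75, 125.375, 125.484375, 125.765625, 125.625, 125.125,
            120.015625, 120.21875, 120.0, 119.953125, 119.890625,
        ],
    },
    index=[0, 1, 2, 3, 4, 5, 371, 372, 373, 374, 375],
)

result = last_n_days(sample, days=3)
print(result)
```

Note that the strict > comparison excludes a row falling exactly on the cutoff; use >= if that boundary row should be kept.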
Upvotes: 2