aki

Reputation: 1

Pandas dataframe groupby by day and find first value that exceeds value at fixed time

I have a datetime-indexed dataframe with several years of intraday data in 2-minute increments. I want to group by day and, for each day, find the first row whose price exceeds that day's price at 06:30:00.

df:

                         Price
2009-10-12 06:30:00      904
2009-10-12 06:32:00      904
2009-10-12 06:34:00      904.5
2009-10-12 06:36:00      905
2009-10-12 06:38:00      905.5
2009-10-13 06:30:00      901
2009-10-13 06:32:00      901
2009-10-13 06:34:00      901
2009-10-13 06:36:00      902
2009-10-13 06:38:00      903

I've tried using .groupby and .apply with a lambda function to group by day and include all rows that exceed the value at 06:30:00, but I get an error:

onh = pd.to_datetime('6:30:00').time()
onhBreak = df.groupby(df.index.date).apply(lambda x: x[x > x.loc[onh]])

ValueError: Can only compare identically-labeled Series objects

Desired output:

                    Price
2009-10-12 06:34:00 904.5 
2009-10-13 06:36:00 902

A groupby result that contains these rows would also be fine.

Any help is appreciated.

Upvotes: 0

Views: 391

Answers (2)

ansev

Reputation: 30930

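Assuming the timestamps are in a Date column (for example after resetting the index), we can build a mask for times after 06:30:00 and take the first True row of each day: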

mask_date = df['Date'].dt.time.gt(pd.to_datetime('06:30:00').time())
df_filtered = df.loc[mask_date.groupby(df['Date'].dt.date).idxmax()]
print(df_filtered)

Output

                 Date  Value
1 2009-10-12 06:32:00  904.0
6 2009-10-13 06:32:00  901.0
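
Note that the mask above compares times of day only, so it returns the first row after 06:30:00 rather than the first row whose price is higher than the 06:30:00 price. A minimal sketch of the price comparison follows. It assumes the layout from the question (a DatetimeIndex and a Price column) and exactly one 06:30:00 row per day; the names ref, ref_per_row, above and first_break are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (DatetimeIndex, 'Price' column)
idx = pd.to_datetime([
    "2009-10-12 06:30:00", "2009-10-12 06:32:00", "2009-10-12 06:34:00",
    "2009-10-12 06:36:00", "2009-10-12 06:38:00",
    "2009-10-13 06:30:00", "2009-10-13 06:32:00", "2009-10-13 06:34:00",
    "2009-10-13 06:36:00", "2009-10-13 06:38:00",
])
df = pd.DataFrame(
    {"Price": [904, 904, 904.5, 905, 905.5, 901, 901, 901, 902, 903]},
    index=idx,
)

onh = pd.to_datetime("06:30:00").time()
days = df.index.normalize()  # midnight timestamp of each row's day

# Price at 06:30:00, re-indexed by calendar day
# (assumption: one 06:30:00 row per day)
ref = df.loc[df.index.time == onh, "Price"]
ref.index = ref.index.normalize()

# Broadcast each day's reference price onto every row of that day;
# days without a 06:30:00 row get NaN and never pass the comparison
ref_per_row = ref.reindex(days).to_numpy()

# Rows after 06:30:00 whose price exceeds that day's 06:30:00 price
mask = (df.index.time > onh) & (df["Price"].to_numpy() > ref_per_row)

# Keep the first such row of each day
above = df[mask]
first_break = above.groupby(above.index.normalize()).head(1)
print(first_break)
```

On the sample data this keeps the 06:34:00 row for 2009-10-12 and the 06:36:00 row for 2009-10-13, which matches the desired output in the question. Days where the price never exceeds the 06:30:00 value simply do not appear in the result.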

Upvotes: 0

BENY

Reputation: 323366

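Here we need groupby with idxmax. Note that idxmax returns the first row of a day when no row in that day satisfies the check, so those rows have to be filtered out at the end.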

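# df is assumed to be a Series here; turn it into a one-column frame
df = df.to_frame('value')
# True for rows after 06:30:00 (onh is the time defined in the question)
df['check'] = df.index.time > onh
# idxmax gives the label of the first True row in each day
subdf = df.loc[df.groupby(df.index.date)['check'].idxmax()]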
Out[237]: 
                      value  check
2009-10-12 00:00:00   900.0  False
2020-05-29 13:08:00  3052.0   True
subdf = subdf[subdf['check']]
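
The check column above tests the time of day. If the goal is the first row whose price exceeds the 06:30:00 price, the same idxmax pattern can be kept and only the check changed. The following is a sketch under assumptions: the frame has a DatetimeIndex and a Price column as in the question, and a third day (2009-10-14) that never exceeds its 06:30:00 price has been made up purely to show why the final filter is needed.

```python
import pandas as pd

# First two days are from the question; 2009-10-14 is a hypothetical
# day added for illustration, where the price never exceeds 06:30:00
idx = pd.to_datetime([
    "2009-10-12 06:30:00", "2009-10-12 06:32:00", "2009-10-12 06:34:00",
    "2009-10-12 06:36:00", "2009-10-12 06:38:00",
    "2009-10-13 06:30:00", "2009-10-13 06:32:00", "2009-10-13 06:34:00",
    "2009-10-13 06:36:00", "2009-10-13 06:38:00",
    "2009-10-14 06:30:00", "2009-10-14 06:32:00", "2009-10-14 06:34:00",
])
df = pd.DataFrame(
    {"Price": [904, 904, 904.5, 905, 905.5,
               901, 901, 901, 902, 903,
               900, 900, 899]},
    index=idx,
)

onh = pd.to_datetime("06:30:00").time()

# Keep the price only on 06:30:00 rows, then spread that value over
# each day ("first" takes the first non-null value in the group)
ref = (
    df["Price"]
    .where(df.index.time == onh)
    .groupby(df.index.date)
    .transform("first")
)

# True for rows after 06:30:00 priced above that day's 06:30:00 price
df["check"] = (df.index.time > onh) & (df["Price"] > ref).to_numpy()

# Label of the first True row per day; cast to int so idxmax works on
# plain numbers (an all-False day yields that day's first row)
first_idx = df["check"].astype(int).groupby(df.index.date).idxmax()
subdf = df.loc[first_idx]

# Drop days where the check was never True
subdf = subdf[subdf["check"]]
print(subdf)
```

Before the last line, subdf holds one row per day, including the 06:30:00 row of the made-up third day with check set to False. The final filter removes it, leaving the 06:34:00 row for 2009-10-12 and the 06:36:00 row for 2009-10-13.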

Upvotes: 0
