Finger twist
Finger twist

Reputation: 3786

Nicest way to check if a value crosses a threshold within a time frame

Given a pandas Dataframe from csv looking like this:

DATE,DESCRIPTION,SELL,BUY:
2014-01-16 15:25:57.817728,"text1",7,10
2014-01-16 15:26:01.445759,"text2",5,8
2014-01-16 15:26:57.856787,"text3",4,10
...

What would be the nicest way to check if a value in the BUY column crosses below a certain threshold within a given time frame, say, for the sake of example, between .minute 40 and .minute 50 and then return the row number ? (only if a cross happens, not if if value is already below a certain threshold befor the given time)

Upvotes: 3

Views: 3763

Answers (2)

hallwyatt
hallwyatt

Reputation: 3

This may be irrelevant for your situation, but I also was trying to figure out how to return True when a price crosses a threshold. In my case, I was trying to figure out how to return True any time the price of Bitcoin crossed the next $1,000 mark. This is what I came up with.

import math

rate = price() #fetches the price of bitcoin

def round_to_lowest_thousand(number):
    return 1000*math.floor(number/1000)

def cross_threshold():
    global rate
    sleep(2)
    rate2 = price()
    if round_to_thousand(rate) != round_to_thousand(rate2):
        return True 
        rate = rate2
    else:
        rate = rate2

cross_threshold()

Hopefully this helps

Upvotes: 0

Yanshuai Cao
Yanshuai Cao

Reputation: 1297

Assuming that your rows are sorted in increasing time order, like in your example (and 'DATE' is a data column not index), then here's one way to achieve what you want:

To cross below a THRESHOLD at row t means that BUY < threshold at t and BUY >= THRESHOLD at t-1. Suppose that D is your dataframe, this is achieved by condition:

cond = (D['BUY'] < threshold) & (D['BUY'].shift(1) >= threshold).

To get all crossing time row indices between START_TIME and END_TIME:

d = D[cond]

cross_row_inds = d[(d['DATE'] >= START_TIME) & (d['DATE'] <= END_TIME)].index

You could also select time first, which might be slightly faster if DATE was a sorted datetime index instead of a data column. But there, depending on your semantics, you might need to include one time instance right before the START_TIME.

Upvotes: 4

Related Questions