detecting value crossing between columns in Pandas

Question

Let's say I have the following dataframe:

df = pd.DataFrame({'a': [10, 20, 30, 40, 50], 'b': [0, 10, 40, 45, 50]}, columns = ['a', 'b'])

I would like to make a list of indices where:

a [i - 1] < b[i] and a[i] >= b[i]

in order to detect when a value, in a timeseries, is crossing another one

is there a Pandas idiomatic way to achieve this without iterating through all the elements?

I tried to create a new column with flags to indicate a crossing by doing this:

df['t'] = (df['a'].shift(1).values < df['b'].values and di['a'].values >= df['b']).astype(bool)

but that won't compile. I'm not sure how to approach this problem, short of doing a loop through all the elements.

Erfan · Accepted Answer

You can use the Series.shift with Series.lt which is "less than", same as < and Series.ge which is "greater than or equal" and is same as >=:

mask = df['a'].shift().lt(df['b']) & df['a'].ge(df['b'])
# same as (df['A'].shift() < df['b']) & (df['a'] >= df['b'])

0    False
1    False
2    False
3    False
4     True
dtype: bool

Notice, we don't have to specify astype(bool), pandas works with boolean indexing and returns booleans when defining conditions.

To get the indices of the rows with True, use:

idx = df[mask].index.tolist()

print(idx)
[4]

detecting value crossing between columns in Pandas

Answers (2)

Related Questions