Zarathustra
Zarathustra

Reputation: 145

Find consecutive elements satisfying conditions pandas

I've got a simple dataframe with the columns 'Quarter' and 'Growth', and I want to find the first instance of when too consecutive entries are negative.

df = pd.DataFrame({
'Quarters': ['2000q1','2000q2','2000q3','2OOOq4','2001q1','2001q2','2001q3'],
'Growth': [np.nan,10,20,-5,-6,10,-8]})

(The first entry is NAN because 'Growth' comes from calculating the difference between entries).

I want the code to output '2001q1', the first occurence of two consecutive negative growths. Thanks!

Upvotes: 0

Views: 311

Answers (3)

wwnde
wwnde

Reputation: 26676

Chain DataFrame.shift¶ and DataFrame.le to boolean select within datframe

 a,*b,= df.loc[df.Growth.le(0)&df.Growth.shift().le(0),'Quarters'].values.tolist()
print(a)

Explanation

df.loc[df.Growth.le(0)&df.Growth.shift().le(0),'Quarters'].values.tolist()#Boolean select consecutive negative numbers in column Quaters into a list
 a,*b,=x# Unpack list x giving the first element

'2001q1'

Upvotes: 1

Mehul Gupta
Mehul Gupta

Reputation: 1939

import numpy as np
import pandas as pd
df = pd.DataFrame({
    'Quarters' :['2000q1','2000q2','2000q3','2OOOq4','2001q1','2001q2','2001q3'],
    'Growth' : [np.nan,10,20,-5,-6,10,-8]})
flag=0
for index, rows in df.iterrows():
     if rows['Growth']<0 and flag<0:
         print(rows['Quarters'])
         break
     flag=rows['Growth']

Explanation: using iterrows(), we can loop over a dataframe. 'flag' variable helps us to store ''Growth value for 'Quarter' just before the current value.

Upvotes: 1

jezrael
jezrael

Reputation: 862611

If always exist at least one pair chain 2 masks by & for bitwise AND, second is same like first only shifted values by Series.shift, then get first True by Series.idxmax and select by DataFrame.loc:

a = df.loc[(df['Growth'].lt(0) & df['Growth'].shift().lt(0)).idxmax(), 'Quarters']
print (a)
2001q1

If not sure if exist 1 pair is possible use next with iter for possible specify default value if no match:

mask = (df['Growth'].lt(0) & df['Growth'].shift().lt(0))
a =  next(iter(df.loc[mask, 'Quarters']), 'no exist 2 negative values')
print (a)
2001q1

Changed data sample:

df = pd.DataFrame({
'Quarters' : ['2000q1','2000q2','2000q3','2004q4','2001q1','2001q2','2001q3'],
'Growth' : [np.nan,10,20,5,-6,10,-8]})

print (df)
  Quarters  Growth
0   2000q1     NaN
1   2000q2    10.0
2   2000q3    20.0
3   2004q4     5.0
4   2001q1    -6.0
5   2001q2    10.0
6   2001q3    -8.0

mask = (df['Growth'].lt(0) & df['Growth'].shift().lt(0))
a =  next(iter(df.loc[mask, 'Quarters']), 'no exist 2 negative consecutive values')
print (a)
no exist 2 negative consecutive values

Upvotes: 3

Related Questions