Zanam

Reputation: 4807

Finding chunks of consecutive non-zero values in a column

I have a dataframe that looks like the following:

DateTime        Rev
8/9/2019 4:00   -1976
8/9/2019 6:00   -1953
8/9/2019 7:00   -1838
8/9/2019 8:00    0
8/9/2019 11:00   0
8/9/2019 12:00   0
8/9/2019 13:00  -1945
8/9/2019 14:00  -2045
8/9/2019 15:00  -1976
8/9/2019 16:00   0
8/9/2019 17:00   0

I want to find the index of each chunk of consecutive non-zero data, i.e. I need to extract the following two separate dataframes from the above:

DateTime        Rev
8/9/2019 4:00   -1976
8/9/2019 6:00   -1953
8/9/2019 7:00   -1838

and

DateTime        Rev
8/9/2019 13:00  -1945
8/9/2019 14:00  -2045
8/9/2019 15:00  -1976

I am not sure how to efficiently approach this.

Upvotes: 1

Views: 51

Answers (1)

jezrael

Reputation: 862611

The idea is to create a unique group label for each run of consecutive negative values, and then build a list of DataFrames with a list comprehension:

m = df['Rev'].lt(0)                     # True where Rev is negative
mask = m.ne(m.shift()).cumsum()[m]      # run label, kept only for negative rows
dfs = [g for _, g in df.groupby(mask)]  # one DataFrame per run
print(dfs)
[        DateTime   Rev
0  8/9/2019 4:00 -1976
1  8/9/2019 6:00 -1953
2  8/9/2019 7:00 -1838,          DateTime   Rev
6  8/9/2019 13:00 -1945
7  8/9/2019 14:00 -2045
8  8/9/2019 15:00 -1976]

print (dfs[0])
        DateTime   Rev
0  8/9/2019 4:00 -1976
1  8/9/2019 6:00 -1953
2  8/9/2019 7:00 -1838

print (dfs[1])
         DateTime   Rev
6  8/9/2019 13:00 -1945
7  8/9/2019 14:00 -2045
8  8/9/2019 15:00 -1976

Detail:

print (mask)
0    1
1    1
2    1
6    3
7    3
8    3
Name: Rev, dtype: int32
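
To make the one-liner for mask easier to follow, here is a minimal sketch that splits it into its intermediate steps. The frame is rebuilt from the sample data in the question, and the names changed and run_id are only illustrative, they are not part of the code above:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "DateTime": ["8/9/2019 4:00", "8/9/2019 6:00", "8/9/2019 7:00",
                 "8/9/2019 8:00", "8/9/2019 11:00", "8/9/2019 12:00",
                 "8/9/2019 13:00", "8/9/2019 14:00", "8/9/2019 15:00",
                 "8/9/2019 16:00", "8/9/2019 17:00"],
    "Rev": [-1976, -1953, -1838, 0, 0, 0, -1945, -2045, -1976, 0, 0],
})

m = df["Rev"].lt(0)          # True where Rev is negative
changed = m.ne(m.shift())    # True on the first row of every run (negative or not)
run_id = changed.cumsum()    # counting run starts gives each run its own label
mask = run_id[m]             # keep the labels of the negative rows only

print(run_id.tolist())
print(mask.tolist())
```

The zero runs get labels too (2 and 4 here), which is why the surviving labels in mask are 1 and 3 rather than 1 and 2.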

Check groups:

df['g'] = mask

print (df)
          DateTime   Rev    g
0    8/9/2019 4:00 -1976  1.0
1    8/9/2019 6:00 -1953  1.0
2    8/9/2019 7:00 -1838  1.0
3    8/9/2019 8:00     0  NaN
4   8/9/2019 11:00     0  NaN
5   8/9/2019 12:00     0  NaN
6   8/9/2019 13:00 -1945  3.0
7   8/9/2019 14:00 -2045  3.0
8   8/9/2019 15:00 -1976  3.0
9   8/9/2019 16:00     0  NaN
10  8/9/2019 17:00     0  NaN
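
If only the index of each chunk is needed rather than the sub-DataFrames, the same labels can be aggregated to a first and last index per chunk. This is a sketch under two assumptions: "non-zero" is taken literally, so it uses ne(0) instead of lt(0), and the names run_id, idx and bounds are illustrative only:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "DateTime": ["8/9/2019 4:00", "8/9/2019 6:00", "8/9/2019 7:00",
                 "8/9/2019 8:00", "8/9/2019 11:00", "8/9/2019 12:00",
                 "8/9/2019 13:00", "8/9/2019 14:00", "8/9/2019 15:00",
                 "8/9/2019 16:00", "8/9/2019 17:00"],
    "Rev": [-1976, -1953, -1838, 0, 0, 0, -1945, -2045, -1976, 0, 0],
})

m = df["Rev"].ne(0)                     # assumption: any non-zero value counts
run_id = m.ne(m.shift()).cumsum()[m]    # run label for each non-zero row

# Index labels of the non-zero rows, grouped by run label
idx = df.index.to_series()[m]
bounds = idx.groupby(run_id).agg(["first", "last"])

print(bounds)
```

Each row of bounds then describes one chunk, so df.loc[first:last] with those two values returns the same rows as the corresponding entry of dfs.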

Upvotes: 1
