Mister Jones

Reputation: 21

Pandas groupby all rows between boolean column values

I'm looking for the pandas syntax to group the rows that fall between True values in a boolean column.

DataTable

distance  time                new
0         2020-10-11 1:30:15  true
5         2020-10-11 1:31:15  false
10        2020-10-11 1:32:15  false
1         2020-10-11 2:30:15  true
2         2020-10-11 2:31:15  false
30        2020-10-11 2:32:15  false
31        2020-10-11 2:33:15  false
0         2020-10-12 1:31:15  true
5         2020-10-12 1:32:15  false

Question

I want to group the rows that belong to each activity. A new activity starts at every row where the new column is true, so the data above should produce 3 groups. Each group should start with a True row and end with the last False row before the next True. How can this be achieved using groupby?

Code

The code below already uses groupby to group by date. I'd like to change it to group on the "new" column instead, so that rows are grouped into activities, and I still plan to iterate over the groups in a for loop like this:

for idx, day in df.groupby(df["time"].dt.date):
    print(idx)
    print(day)

Upvotes: 1

Views: 1264

Answers (1)

user17242583

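Just group by df['new'].cumsum(). The cumulative sum of a boolean column goes up by 1 at every True, so all rows of one activity share the same value: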

for idx, day in df.groupby(df['new'].cumsum()):
    print('-----')
    print(idx)
    print(day)

Output:

-----
1
   distance                time    new
0         0  2020-10-11+1:30:15   True
1         5  2020-10-11+1:31:15  False
2        10  2020-10-11+1:32:15  False
-----
2
   distance                time    new
3         1  2020-10-11+2:30:15   True
4         2  2020-10-11+2:31:15  False
5        30  2020-10-11+2:32:15  False
6        31  2020-10-11+2:33:15  False
-----
3
   distance                time    new
7         0  2020-10-12+1:31:15   True
8         5  2020-10-12+1:32:15  False
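
For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the sample data and summarizes each activity. It assumes the flags may have been loaded as the strings "true"/"false" (as they appear in the question's table), so it normalizes them to real booleans before calling cumsum(). The names `is_new`, `activity`, and the summary column names are only illustrative and do not come from the question.

```python
import pandas as pd

# Sample data from the question (times zero-padded for unambiguous parsing).
df = pd.DataFrame({
    "distance": [0, 5, 10, 1, 2, 30, 31, 0, 5],
    "time": pd.to_datetime([
        "2020-10-11 01:30:15", "2020-10-11 01:31:15", "2020-10-11 01:32:15",
        "2020-10-11 02:30:15", "2020-10-11 02:31:15", "2020-10-11 02:32:15",
        "2020-10-11 02:33:15", "2020-10-12 01:31:15", "2020-10-12 01:32:15",
    ]),
    # Assumption: the flags arrived as text rather than as booleans.
    "new": ["true", "false", "false", "true", "false",
            "false", "false", "true", "false"],
})

# Normalize to booleans; this also works if the column is already bool.
is_new = df["new"].astype(str).str.lower().eq("true")

# The running count of True values labels each row with its activity number.
activity = is_new.cumsum().rename("activity")

# One row per activity: time span, row count, and largest distance.
summary = df.groupby(activity).agg(
    start=("time", "min"),
    end=("time", "max"),
    rows=("distance", "size"),
    max_distance=("distance", "max"),
)
print(summary)
```

Note that any rows before the first True would end up in a group labelled 0, which may or may not be what you want.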

Upvotes: 1
