JejeBelfort

Reputation: 1663

Apply function to a range of specific rows

I have the following dataframe df:

            bucket_value  is_new_bucket
dates                                  
2019-03-07             0              1
2019-03-08             1              0
2019-03-09             2              0
2019-03-10             3              0
2019-03-11             4              0
2019-03-12             5              1
2019-03-13             6              0
2019-03-14             7              1

I want to apply a specific function (say, the mean) to each group of consecutive bucket_value rows where the column is_new_bucket is equal to zero, so that the resulting dataframe looks like this:

            mean_values
dates             
2019-03-08     2.5
2019-03-13     6.0

In other words, I want to apply a function that takes bucket_value as input to each run of consecutive rows where is_new_bucket = 0.

For instance, if I want to apply the max function, the resulting dataframe would look like this:

            max_values
dates             
2019-03-11     4.0
2019-03-13     6.0
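For reference, the example frame can be rebuilt like this (a minimal sketch; pandas assumed available):

```python
import pandas as pd

# Reconstruct the example DataFrame shown above, indexed by date,
# so the expected outputs can be reproduced.
df = pd.DataFrame(
    {
        "bucket_value": [0, 1, 2, 3, 4, 5, 6, 7],
        "is_new_bucket": [1, 0, 0, 0, 0, 1, 0, 1],
    },
    index=pd.to_datetime(
        ["2019-03-07", "2019-03-08", "2019-03-09", "2019-03-10",
         "2019-03-11", "2019-03-12", "2019-03-13", "2019-03-14"]
    ),
)
df.index.name = "dates"
print(df)
```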

Upvotes: 1

Views: 219

Answers (1)

BENY

Reputation: 323226

Using cumsum with filter

df.reset_index(inplace=True)
s=df.loc[df.is_new_bucket==0].groupby(df.is_new_bucket.cumsum()).agg({'dates':'first','bucket_value':['mean','max']})
s
                   dates bucket_value    
                   first         mean max
is_new_bucket                            
1             2019-03-08          2.5   4
2             2019-03-13          6.0   6
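A self-contained, runnable sketch of the cumsum + filter idea (the frame is rebuilt here so the snippet stands alone; pandas assumed available):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "dates": pd.to_datetime(
            ["2019-03-07", "2019-03-08", "2019-03-09", "2019-03-10",
             "2019-03-11", "2019-03-12", "2019-03-13", "2019-03-14"]
        ),
        "bucket_value": [0, 1, 2, 3, 4, 5, 6, 7],
        "is_new_bucket": [1, 0, 0, 0, 0, 1, 0, 1],
    }
)

# cumsum() gives every run that starts at is_new_bucket == 1 its own key;
# filtering to is_new_bucket == 0 keeps only the rows inside each bucket.
# The grouper Series is aligned to the filtered frame by index.
s = (
    df.loc[df["is_new_bucket"] == 0]
    .groupby(df["is_new_bucket"].cumsum())
    .agg({"dates": "first", "bucket_value": ["mean", "max"]})
)
print(s)
```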

Updated

df.loc[df.loc[df.is_new_bucket==0].groupby(df.is_new_bucket.cumsum())['bucket_value'].idxmax()]
       dates  bucket_value  is_new_bucket
4 2019-03-11             4              0
6 2019-03-13             6              0
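The idxmax step can be sketched standalone: pick, within each bucket, the row that holds the maximum, keeping that row's own date (frame rebuilt for self-containment; pandas assumed available):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "dates": pd.to_datetime(
            ["2019-03-07", "2019-03-08", "2019-03-09", "2019-03-10",
             "2019-03-11", "2019-03-12", "2019-03-13", "2019-03-14"]
        ),
        "bucket_value": [0, 1, 2, 3, 4, 5, 6, 7],
        "is_new_bucket": [1, 0, 0, 0, 0, 1, 0, 1],
    }
)

# idxmax() returns, per group, the index label of the maximal row;
# df.loc then pulls those full rows back out.
idx = (
    df.loc[df["is_new_bucket"] == 0]
    .groupby(df["is_new_bucket"].cumsum())["bucket_value"]
    .idxmax()
)
result = df.loc[idx]
print(result)
```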

Updated 2: after using cumsum to create the group key Newkey, you can do whatever you need based on the group key

df['Newkey']=df.is_new_bucket.cumsum()
df
       dates  bucket_value  is_new_bucket  Newkey
0 2019-03-07             0              1       1
1 2019-03-08             1              0       1
2 2019-03-09             2              0       1
3 2019-03-10             3              0       1
4 2019-03-11             4              0       1
5 2019-03-12             5              1       2
6 2019-03-13             6              0       2
7 2019-03-14             7              1       3
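With the Newkey column materialized, any aggregation (mean, max, or a custom function) can be run per bucket; a sketch, rebuilding the frame so it runs alone (pandas assumed available):

```python
import pandas as pd

df = pd.DataFrame(
    {
        "dates": pd.to_datetime(
            ["2019-03-07", "2019-03-08", "2019-03-09", "2019-03-10",
             "2019-03-11", "2019-03-12", "2019-03-13", "2019-03-14"]
        ),
        "bucket_value": [0, 1, 2, 3, 4, 5, 6, 7],
        "is_new_bucket": [1, 0, 0, 0, 0, 1, 0, 1],
    }
)

# Materialize the group key, then aggregate only the in-bucket rows.
df["Newkey"] = df["is_new_bucket"].cumsum()
out = df[df["is_new_bucket"] == 0].groupby("Newkey")["bucket_value"].mean()
print(out)
```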

Upvotes: 2
