Python Pandas groupby: filter and apply according to condition on values

Question

I have this pandas dataframe

interval_mins = {
    '10' : 0.11,
    '15' : 0.4,
    '20' : 0.19
}
pd.DataFrame({
    'id' : [10, 15, 20, 10, 20, 15],
    'interval' : [0.1, 0.39, 0.2, 0.12, 0.25, 0.42]
})

In pandas DataFrame, I want to select items with interval values less than interval_mins for each id, and then add to the next interval value of the same id.

Is there a way without using for?

Expected output:

pd.DataFrame({
    'id' : [10, 15, 20, 10, 20, 15],
    'interval' : [0.1, 0.39, 0.2, 0.22, 0.25, 0.81]
})

Shubham Sharma · Accepted Answer

Let's do:

m = df['interval'] < df['id'].astype(str).map(interval_mins)
df.loc[m.groupby(df['id']).shift(fill_value=False), 'interval'] += df.groupby('id')['interval'].shift()

Details:

Create a boolean mask representing condition where interval values are less than interval_mins for each id:

print(m)

0     True
1     True
2    False
3    False
4    False
5    False
dtype: bool

groupby the boolean mask m on id and shift downwards:

print(m.groupby(df['id']).shift(fill_value=False))

0    False
1    False
2    False
3     True
4    False
5     True
dtype: bool

groupby the dataframe on id and shift the interval column:

print(df.groupby('id')['interval'].shift())

0     NaN
1     NaN
2     NaN
3    0.10
4    0.20
5    0.39
Name: interval, dtype: float64

Use boolean indexing with loc to add the values corresponding to shifted mask:

print(df)

   id  interval
0  10      0.10
1  15      0.39
2  20      0.20
3  10      0.22
4  20      0.25
5  15      0.81

Python Pandas groupby: filter and apply according to condition on values

Answers (2)

Related Questions