Reputation: 31
I have this pandas dataframe
interval_mins = {
'10' : 0.11,
'15' : 0.4,
'20' : 0.19
}
pd.DataFrame({
'id' : [10, 15, 20, 10, 20, 15],
'interval' : [0.1, 0.39, 0.2, 0.12, 0.25, 0.42]
})
In pandas DataFrame, I want to select items with interval
values less than interval_mins
for each id
, and then add to the next interval
value of the same id
.
Is there a way without using for
?
Expected output:
pd.DataFrame({
'id' : [10, 15, 20, 10, 20, 15],
'interval' : [0.1, 0.39, 0.2, 0.22, 0.25, 0.81]
})
Upvotes: 1
Views: 457
Reputation: 71689
Let's do:
m = df['interval'] < df['id'].astype(str).map(interval_mins)
df.loc[m.groupby(df['id']).shift(fill_value=False), 'interval'] += df.groupby('id')['interval'].shift()
Details:
Create a boolean mask representing condition where interval
values are less than interval_mins
for each id
:
print(m)
0 True
1 True
2 False
3 False
4 False
5 False
dtype: bool
groupby
the boolean mask m
on id
and shift
downwards:
print(m.groupby(df['id']).shift(fill_value=False))
0 False
1 False
2 False
3 True
4 False
5 True
dtype: bool
groupby
the dataframe on id
and shift
the interval
column:
print(df.groupby('id')['interval'].shift())
0 NaN
1 NaN
2 NaN
3 0.10
4 0.20
5 0.39
Name: interval, dtype: float64
Use boolean indexing with loc
to add the values corresponding to shifted mask:
print(df)
id interval
0 10 0.10
1 15 0.39
2 20 0.20
3 10 0.22
4 20 0.25
5 15 0.81
Upvotes: 2
Reputation: 8219
Consistent with your rule, but not with your output (see my comment), this should do it. I leave intermediate calculations in place so it is easier to understand what's going on. Note I replaced keys in interval_mis
with int
s from str
s. Desired output in column interval_2
df['add'] = df['id'].map(interval_mins)
df['add_cond'] = df['add']*(df['add'] > df['interval'])
df = (df.groupby('id')
.apply(lambda d: d.assign(add_cond_shift=d['add_cond'].shift()))
.fillna(0)
.reset_index(drop = True)
)
df['interval_2'] = df['interval'] + df['add_cond_shift']
df
produces
id interval add add_cond add_cond_shift interval_2
0 10 0.10 0.11 0.11 0.00 0.10
1 10 0.12 0.11 0.00 0.11 0.23
2 15 0.39 0.40 0.40 0.00 0.39
3 15 0.42 0.40 0.00 0.40 0.82
4 20 0.20 0.22 0.22 0.00 0.20
5 20 0.25 0.22 0.00 0.22 0.47
Upvotes: 0