Reputation: 8033
I have a dataframe as below
id Supply days days_180
1 30 0 180
1 100 183 363
1 80 250 430
2 5 0 180
2 5 10 190
3 5 0 180
3 30 100 280
3 30 150 330
3 30 200 380
3 30 280 460
3 50 310 490
I want to sum 'Supply' where days are between 'days' & 'days+180' for each row. This needs to be done for each group after groupby('id').
The expected output is as below
id Supply days days_180 use
1 30 0 180 30
1 100 183 363 180
1 80 250 430 80
2 5 0 180 10
2 5 10 190 10
3 5 0 180 65
3 30 100 280 120
3 30 150 330 140
3 30 200 380 110
3 30 280 460 80
3 50 310 490 50
I have tried the code below, but it is not working as intended.
df_d['use']=df_d.groupby('id').apply(lambda x: x.loc[x['days'].between(x['days'],x['days_180']),'supply'].sum())
Upvotes: 2
Views: 42
Reputation: 5451
you can also do it using numpy's broadcast and np.where
df.groupby("id").apply(
lambda g: g.assign(use=(np.where((g.days.values>=g.days.values[:, np.newaxis]) &
(g.days.values<=g.days_180.values[:, np.newaxis]), g.Supply, 0).sum(axis=1)))
).reset_index(drop=True)
also jezrael's can be modified like below to use lambda in side apply
df = df.groupby('id').apply(lambda x:
x.assign(use=[x.loc[(x['days'] <= d) & (x['days_180'] >= d),'Supply'].sum() for d in x['days_180']])
)
print (df)
Upvotes: 0
Reputation: 862581
Use list comprehension for loop each days_180
values per groups, filter with sum
and create new column:
def f(x):
a = [x.loc[(x['days'] <= d) & (x['days_180'] >= d),'Supply'].sum() for d in x['days_180']]
x['use'] = a
return x
Or solution with another lambda:
def f(x):
x['use'] = x['days_180'].apply(lambda d: x.loc[(x['days'] <= d) &
(x['days_180'] >= d), 'Supply'].sum())
return x
df_d = df_d.groupby('id').apply(f)
print (df_d)
id Supply days days_180 use
0 1 30 0 180 30
1 1 100 183 363 180
2 1 80 250 430 80
3 2 5 0 180 10
4 2 5 10 190 5
5 3 5 0 180 65
6 3 30 100 280 120
7 3 30 150 330 140
8 3 30 200 380 110
9 3 30 280 460 80
10 3 50 310 490 50
Upvotes: 2