Reputation: 1154
Here is my problem :
In my dataframe i have two cols : km and value
km value
0 1 10
1 10 2
2 3 5
and i would like to make sum of value from range of km like a sum of value for 1 < km =< 3 then 3 < km =< 6 etc ...
i've tried this :
i = 0
j = 3
while j < 200 :
sum_km = dvf[i < dvf['km'] and j >= dvf['km'] ]['value'].sum()
i += 3
j += 3
but i have :
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I would like to have at the end :
Total value for 1 at 3 kms : 15
Total value for 3 at 6 kms : 0
Total value for 6 at 9 kms : 0
Total value for 9 at 12 kms : 2
Upvotes: 0
Views: 76
Reputation: 30920
Use pd.interval_range
+ pd.cut
:
bins=pd.interval_range(df['km'].min()-1, 12, freq=3)
df.groupby(pd.cut(df['km'],bins)).value.sum()
km
(0, 3] 15
(3, 6] 0
(6, 9] 0
(9, 12] 2
Name: value, dtype: int64
Upvotes: -1
Reputation: 1604
If you want all intervals:
width = 3
min_val = dvf['km'].min() - 1
max_val = dvf['km'].max()
bins = pd.IntervalIndex.from_tuples([(x, x+width) for x in range(min_val, max_val, width)])
dvf.groupby(pd.cut(dvf['km'], bins = bins))['value'].agg('sum')
>>> output
km
(0, 3] 15
(3, 6] 0
(6, 9] 0
(9, 12] 2
I'll leave the pint formatting to you as that seems to be besides the main problem.
Upvotes: 2
Reputation: 323306
My solution pd.cut
df.value.groupby(pd.cut(df.km,[0,3,6,9,12])).sum()
Out[400]:
km
(0, 3] 15
(3, 6] 0
(6, 9] 0
(9, 12] 2
Name: value, dtype: int64
Upvotes: 1
Reputation: 322
My solution is similar to ansev's except I interpreted your grouping a little differently.
df['bin'] = (df['km']/3).astype(int)
df[['bin','value']].groupby('bin').agg(sum)
Upvotes: 1
Reputation: 908
You need boolean operators:
dvf[(i < dvf['km']) & (j >= dvf['km'])]['value'].sum()
Upvotes: -2