Reputation: 2201
I have a pd.DataFrame like below,
Hr Name Count Day
6 1318 10.0 Friday
7 1318 20.0 Friday
8 1318 2.0 Friday
9 1318 18.0 Friday
6 1318 2.0 Monday
7 1318 15.0 Monday
8 1318 2.0 Monday
9 1318 5.0 Monday
6 1319 20.0 Friday
7 1319 30.0 Friday
8 1319 50.0 Friday
9 1319 5.0 Friday
6 1319 3.0 Monday
7 1319 30.0 Monday
8 1319 2.0 Monday
9 1319 5.0 Monday
I want to iterate each Count and sum Name wise, then make 1 if the value is >=20. finally count the values how many times reached above >=20. When sum of Count reached >=20 then next value should be counted with actual value.
Expected OP :
[{'Friday' :[2,3],'Monday':[1,1]}]
Here is How to Do for Friday:
1318Friday[10+20=30 30 >=20 so 1, 2+18 =20 again 1, Total is 2]
1319Friday[20 20 >=20 so 1, 30 again 1,50 again 1,5 is !>=20, Total is 3]
Finally {'Friday' : [2,3]}
Code I have tried :
finalresult = [data]
df = pd.DataFrame(finalresult)
df['csum'] = df.groupby(['Day','Name'])['Count'].cumsum()
Upvotes: 4
Views: 500
Reputation: 76336
First create a running count of the days:
df['running'] = (df.Day != df.Day.shift(1)).cumsum()
>>> df
Hr Name Count Day running
0 6 1318 10.0 Friday 1
1 7 1318 20.0 Friday 1
2 8 1318 2.0 Friday 1
3 9 1318 18.0 Friday 1
4 6 1318 2.0 Monday 2
5 7 1318 15.0 Monday 2
6 8 1318 2.0 Monday 2
7 9 1318 5.0 Monday 2
8 6 1319 20.0 Friday 3
9 7 1319 30.0 Friday 3
10 8 1319 50.0 Friday 3
11 9 1319 5.0 Friday 3
12 6 1319 3.0 Monday 4
13 7 1319 30.0 Monday 4
14 8 1319 2.0 Monday 4
15 9 1319 5.0 Monday 4
This will allow you to group by it.
Now (unfortunately) create a function to count the counts of over 20:
def count_counts(g):
s, c = 0, 0
for e in g:
s += e
if s > 20:
s, c = 0, c + 1
return c
group by days and runnings and apply it, then group by days and list the results:
cc = df.Count.groupby([df.Day, df.running]).apply(count_counts).reset_index()
>>> cc.Count.groupby(cc.Day).apply(list)
Day
Friday [1, 2]
Monday [1, 1]
Name: Count, dtype: object
Upvotes: 1
Reputation: 21729
You can do this by apply
a custom function to the grouped data.
## custom function
def get_cumsum(x):
adds = 0
counter = 0
for i in x:
counter += i
if counter >= 20:
adds += 1
counter=0
return adds
## apply the function to dataframe
df1 = df.groupby(['Name','Day'])['Count'].apply(get_sum).reset_index()
## get dict
df1.groupby('Day')['Count'].apply(list).to_dict()
{'Friday': [2, 3], 'Monday': [1, 1]}
Explanation:
Upvotes: 3