Reputation: 23
I want to write a simple Python program with pandas that adds up how many minutes each person spent on something per date. The data comes from an HTML file converted to an Excel file. Here is my data sample:
Name Date Minutes
foo 1/12/2000 100
foo 1/12/2000 75
foo 1/12/2020 10
foo 1/13/2020 50
bar 1/13/2020 25
bar 1/14/2020 120
I then tried groupby(["Name", "Date", "Minutes"]).sum(), expecting this result:
Name  Date       Minutes
foo   1/12/2020  185
      1/13/2020  50
bar   1/13/2020  25
      1/14/2020  120
but instead I get:
Name  Date       Minutes
foo   1/12/2020  100
                 75
                 10
      1/13/2020  50
bar   1/13/2020  25
      1/14/2020  120
I googled my problem first and came across this thread, but somehow my result is different. I also tried using agg and changing the Minutes dtype to int64, but the result is the same. Any help is really appreciated.
Upvotes: 0
Views: 242
Reputation: 25190
If you want to sum the Minutes column, don't include it in the groupby. Including it in the groupby keys means that rows with different values of Minutes go into different groups, so there is nothing left to add up within each group.
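One way to see this is to count the groups each choice of keys produces. The following is a minimal sketch that rebuilds the question's sample data by hand (the DataFrame construction is an assumption, since the original data came from an Excel file):

```python
import pandas as pd

# Sample data copied from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "Name": ["foo", "foo", "foo", "foo", "bar", "bar"],
    "Date": ["1/12/2000", "1/12/2000", "1/12/2020",
             "1/13/2020", "1/13/2020", "1/14/2020"],
    "Minutes": [100, 75, 10, 50, 25, 120],
})

# With Minutes as a key, every distinct (Name, Date, Minutes) row is its own group.
groups_with_minutes = df.groupby(["Name", "Date", "Minutes"]).ngroups

# Without it, rows sharing Name and Date fall into the same group.
groups_without_minutes = df.groupby(["Name", "Date"]).ngroups

print(groups_with_minutes, groups_without_minutes)
```

With Minutes among the keys there are as many groups as distinct rows, which is why the output in the question looks unsummed.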
Here's how to add up the Minutes for rows with the same Name and Date.
>>> df
  Name       Date  Minutes
0  foo  1/12/2000      100
1  foo  1/12/2000       75
2  foo  1/12/2020       10
3  foo  1/13/2020       50
4  bar  1/13/2020       25
5  bar  1/14/2020      120
>>> df.groupby(["Name", "Date"]).sum()
                Minutes
Name Date
bar  1/13/2020       25
     1/14/2020      120
foo  1/12/2000      175
     1/12/2020       10
     1/13/2020       50
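If you would rather get a flat table with the names in their original order (foo before bar, as in the question's expected output), something like the sketch below might help. It assumes the same hand-built sample data and uses the `as_index=False` and `sort=False` options of groupby; treat it as one possible variant rather than the only way to do it.

```python
import pandas as pd

# Sample data copied from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "Name": ["foo", "foo", "foo", "foo", "bar", "bar"],
    "Date": ["1/12/2000", "1/12/2000", "1/12/2020",
             "1/13/2020", "1/13/2020", "1/14/2020"],
    "Minutes": [100, 75, 10, 50, 25, 120],
})

# as_index=False keeps Name and Date as ordinary columns;
# sort=False keeps groups in order of first appearance.
totals = df.groupby(["Name", "Date"], as_index=False, sort=False)["Minutes"].sum()

print(totals)
```

Note that the sample contains two rows dated 1/12/2000 and one dated 1/12/2020, so they end up in separate groups (175 and 10). If the 2000 dates were meant to be 2020, those three rows would share one group and total 185, matching the expected result in the question.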
Upvotes: 0