Reputation: 167
I'm not sure what I'm doing wrong here. This is my code:
df['PV_SUM'] = df.groupby('DOCKET').agg({'PV':sum})
It is not returning any results, just an empty series.
This is my hypothetical dataframe:
DOCKET PV
1a 1
1a 1
1a 1
1b 0
1b 1
1b 1
And this is the result I'm looking for:
DOCKET PV PV_SUM
1a 1 3
1a 1 3
1a 1 3
1b 0 2
1b 1 2
1b 1 2
What am I doing wrong? The dtype of DOCKET is object and the dtype of PV is float. I've changed the dtype of PV to int, but no luck.
Upvotes: 1
Views: 1045
Reputation: 4929
Use transform instead:
df['PV_SUM'] = df.groupby('DOCKET').PV.transform(sum)
Output:
  DOCKET  PV  PV_SUM
0     1a   1       3
1     1a   1       3
2     1a   1       3
3     1b   0       2
4     1b   1       2
5     1b   1       2
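For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the hypothetical dataframe from the question. The string 'sum' is passed instead of the builtin sum; that is my substitution, not part of the original answer, since recent pandas versions warn about the builtin.

```python
import pandas as pd

# Hypothetical data copied from the question
df = pd.DataFrame({
    "DOCKET": ["1a", "1a", "1a", "1b", "1b", "1b"],
    "PV": [1, 1, 1, 0, 1, 1],
})

# transform returns a Series with the same index as df (one value per
# original row), so the assignment lines up row by row
df["PV_SUM"] = df.groupby("DOCKET")["PV"].transform("sum")
print(df)
```

Because the result has one value per original row rather than one per group, no index juggling is needed.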
The issue with your code is that df.groupby('DOCKET').agg({'PV':sum}) returns a dataframe with DOCKET as index and PV as value column. When you try assigning it back to the dataframe, pandas looks for matching index labels, and, since there are no matches, it fills the new column with NaN.
For example, take a look at the output from df.groupby('DOCKET').agg({'PV':sum}):
        PV
DOCKET
1a       3
1b       2
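The mismatch can be checked directly by comparing the two indexes. This is a small sketch on the question's hypothetical data; it assigns the PV column of the aggregated result (a Series) rather than the whole dataframe, which is my simplification to keep the alignment behaviour unambiguous.

```python
import pandas as pd

# Hypothetical data copied from the question
df = pd.DataFrame({
    "DOCKET": ["1a", "1a", "1a", "1b", "1b", "1b"],
    "PV": [1, 1, 1, 0, 1, 1],
})

agg = df.groupby("DOCKET").agg({"PV": "sum"})

# The aggregated result is labelled by DOCKET, the original by row number
agg_labels = agg.index.tolist()
df_labels = df.index.tolist()
print(agg_labels)
print(df_labels)

# Assignment aligns on index labels; none are shared, so every row is NaN
df["PV_SUM"] = agg["PV"]
all_missing = df["PV_SUM"].isna().all()
print(all_missing)
```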
Since pandas aligns on the index, you could first set the index of your dataframe to "DOCKET"; then the assignment works as expected:
result = df.groupby('DOCKET').agg({'PV':sum})
df = df.set_index('DOCKET')
df['PV_SUM'] = result
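If you would rather keep DOCKET as a regular column and leave the original row index untouched, one possible variation on the same idea is to look up each row's group total with map. This is a sketch of an alternative, not something from the answer above; the name sums is just illustrative.

```python
import pandas as pd

# Hypothetical data copied from the question
df = pd.DataFrame({
    "DOCKET": ["1a", "1a", "1a", "1b", "1b", "1b"],
    "PV": [1, 1, 1, 0, 1, 1],
})

# One total per DOCKET value, indexed by DOCKET
sums = df.groupby("DOCKET")["PV"].sum()

# map looks up each row's DOCKET in that index, so the result keeps df's own index
df["PV_SUM"] = df["DOCKET"].map(sums)
print(df)
```

This gives the same PV_SUM column as transform, just spelled out in two steps.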
Upvotes: 2