Reputation:
I have the CSV below:
ID,PR_No,PMO,PRO,REV,COST
111,111,AB,MA,2575,2575
111,111,AB,MA,-1137,-1137
112,112,CD,KB,1134,3334
111,111,AB,MA,100,100
The expected output is:
ID,PR_No,PMO,PRO,REV,COST
111,111,AB,MA,1538,1538
112,112,CD,KB,1134,3334
where 1538 = 2575 - 1137 + 100.
My code throws a ValueError:
df_n = df.groupby([['ID','PR_No','PMO','PRO']]).agg({'REV':sum,'COST':sum})
Upvotes: 3
Views: 6389
Reputation: 213
Try an aggregation function such as agg or sum; it will work:
df_n = df.groupby([......])[....].sum()
Upvotes: -1
Reputation: 862661
Remove the nested [] so that a flat list of column names is passed:
df_n = df.groupby(['ID','PR_No','PMO','PRO']).agg({'REV':sum,'COST':sum})
print (df_n)
                    REV  COST
ID  PR_No PMO PRO
111 111   AB  MA   1538  1538
112 112   CD  KB   1134  3334
Because the same aggregate function is applied to both columns, you can also select a list of columns after groupby (double brackets) and call .sum():
df_n = df.groupby(['ID','PR_No','PMO','PRO'])[['REV','COST']].sum()
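If the goal is the flat layout shown in the question (keys as ordinary columns rather than a MultiIndex), passing as_index=False to groupby is one way to get it. The following is a minimal sketch, assuming the question's CSV is read from an in-memory string; the names csv_text and keys are only illustrative:

```python
import io

import pandas as pd

# The sample CSV from the question, held in a string for illustration.
csv_text = """ID,PR_No,PMO,PRO,REV,COST
111,111,AB,MA,2575,2575
111,111,AB,MA,-1137,-1137
112,112,CD,KB,1134,3334
111,111,AB,MA,100,100
"""
df = pd.read_csv(io.StringIO(csv_text))

# A flat list of column names: each name is one grouping key.
keys = ['ID', 'PR_No', 'PMO', 'PRO']

# as_index=False keeps the keys as regular columns in the result.
df_n = df.groupby(keys, as_index=False)[['REV', 'COST']].sum()
print(df_n)
```

Calling df_n.to_csv(index=False) on the result should then give text in the same shape as the expected output in the question.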
ValueError: Grouper and axis must be same length
What does it mean?
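With the sample data the original code runs without an error, because the nested list has the same length (4) as the sample data (4 rows). pandas then treats the inner list as an array of group labels, one per row, instead of as column names: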
df_n = df.groupby([['ID','PR_No','PMO','PRO']]).agg({'REV':sum,'COST':sum})
print (df_n)
          REV   COST
ID       2575   2575
PMO      1134   3334
PRO       100    100
PR_No   -1137  -1137
If the list contains duplicates, the rows that share a label are aggregated together:
df_n = df.groupby([['ID','ID','PRO','PRO']]).agg({'REV':sum,'COST':sum})
print (df_n)
      REV  COST
ID   1438  1438
PRO  1234  3434
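To make the "one label per row" reading concrete, here is a small sketch. It assumes the 4-row sample data is rebuilt by hand, and it compares the nested list with an explicit Series of labels; labels, nested and by_series are illustrative names:

```python
import pandas as pd

# The 4-row sample data from the question, rebuilt by hand.
df = pd.DataFrame({
    'ID': [111, 111, 112, 111],
    'PR_No': [111, 111, 112, 111],
    'PMO': ['AB', 'AB', 'CD', 'AB'],
    'PRO': ['MA', 'MA', 'KB', 'MA'],
    'REV': [2575, -1137, 1134, 100],
    'COST': [2575, -1137, 3334, 100],
})

# One label per row: rows 0-1 get 'ID', rows 2-3 get 'PRO'.
# These strings are plain labels here, not column names.
labels = ['ID', 'ID', 'PRO', 'PRO']

# Nested list: the inner list is used as an array of row labels.
nested = df.groupby([labels])[['REV', 'COST']].sum()

# The same grouping written explicitly as a Series aligned to the rows.
by_series = df.groupby(pd.Series(labels, index=df.index))[['REV', 'COST']].sum()

print(nested)
print(by_series)
```

Both results sum rows 0 and 1 under 'ID' and rows 2 and 3 under 'PRO', which is why the column values in ID, PR_No, PMO and PRO play no part in the grouping.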
But if the lengths differ, it fails:
print (df)
    ID  PR_No PMO PRO   REV  COST
0  111    111  AB  MA  2575  2575
1  111    111  AB  MA -1137 -1137
2  112    112  CD  KB  1134  3334
3  111    111  AB  MA   100   100
4  111    111  AB  MA   100   100 <- added new row
df_n = df.groupby([['ID','ID','PRO','PRO']]).agg({'REV':sum,'COST':sum})
print (df_n)
ValueError: Grouper and axis must be same length
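The failure can be reproduced and contrasted with the flat-list form in a few lines. This is only a sketch, assuming the 5-row frame printed above; keys, error and df_n are illustrative names, and the exact error text may vary between pandas versions:

```python
import pandas as pd

# The 5-row frame from above (the last row is the added one).
df = pd.DataFrame({
    'ID': [111, 111, 112, 111, 111],
    'PR_No': [111, 111, 112, 111, 111],
    'PMO': ['AB', 'AB', 'CD', 'AB', 'AB'],
    'PRO': ['MA', 'MA', 'KB', 'MA', 'MA'],
    'REV': [2575, -1137, 1134, 100, 100],
    'COST': [2575, -1137, 3334, 100, 100],
})

keys = ['ID', 'PR_No', 'PMO', 'PRO']

# Nested list: 4 labels for 5 rows, so the lengths do not match.
try:
    df.groupby([keys]).agg({'REV': 'sum', 'COST': 'sum'})
    error = None
except ValueError as exc:
    error = exc
print(len(keys), len(df), repr(error))

# Flat list: the names are looked up as columns, so row count is irrelevant.
df_n = df.groupby(keys).agg({'REV': 'sum', 'COST': 'sum'})
print(df_n)
```

The flat list does not depend on how many rows the frame has, which is why it keeps working when the real data is longer than the sample.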
Upvotes: 5