Reputation: 65
When I use the following code:
print(self.df.groupby(by=[2])[3].agg(['sum']))
On the following Dataframe:
0 1 2 3 4 5 6 7
0 15 LCU Test 1 308.02 170703 ALCU 4868 MS10
1 16 LCU Test 2 127.37 170703 ALCU 4868 MS10
The sum function is not completed correctly because the value column (col 3) returns a concatenated string of the values (308.02127.37) instead of maintaining the integrity of the individual values to allow operation.
Upvotes: 4
Views: 6968
Reputation: 402353
It seems like your 3rd column is a string. Did you load in your dataframe using dtype=str
?
Furthermore, try not to hardcode your columns. You can use .astype
or pd.to_numeric
to cast and then apply sum
:
self.df.groupby(self.df.columns[2])[self.df.columns[3]].agg(
lambda x: pd.to_numeric(x, errors='coerce').sum()
)
Or
self.df.groupby(self.df.columns[2])[self.df.columns[3]].agg(
lambda x: x.astype(float).sum()
)
Upvotes: 11