j_d

Reputation: 99

Adding up numbers in a column in Python doesn't work properly?

I have a dataframe (called df) that currently looks like this:

Date        Amount
01/11/2019  -0.4
01/11/2019  -15.81
01/11/2019  -21.98
31/10/2019  -5.27
30/10/2019  -1.5
30/10/2019  -20
30/10/2019  -5,000

I would like to sum the column "Amount" up. To do so, I have taken the following steps:

df['Amount'] = df['Amount'].str.replace(',', '')
pd.to_numeric(df['Amount'])
df['Amount'].sum()

However, when I try to sum it, I get a string, even though the column "Amount" is clearly a float:

'-0.4-15.81-21.98-5.27-1.5-20-5000'

Does anyone have any advice on how to solve this? I've been stuck on this for a while!

Thank you!

Upvotes: 3

Views: 498

Answers (4)

MEdwin

Reputation: 2960

pd.read_csv has a thousands argument that tells pandas which character separates digit groups, so values such as -5,000 are parsed as numbers when the data is read. See the mockup below, and let me know if it works.

import pandas as pd
from io import StringIO  # Python 3 location; the standalone StringIO module is Python 2 only
Mydata = StringIO("""Date  Amount
01/11/2019  -0.4
01/11/2019  -15.81
01/11/2019  -21.98
31/10/2019  -5.27
30/10/2019  -1.5
30/10/2019  -20
30/10/2019  -5,000
    """)

# thousands=',' strips the digit-group commas, so Amount is parsed as a float column
df = pd.read_csv(Mydata, sep="  ", engine='python', thousands=',')

df

Result:

         Date   Amount
0  01/11/2019    -0.40
1  01/11/2019   -15.81
2  01/11/2019   -21.98
3  31/10/2019    -5.27
4  30/10/2019    -1.50
5  30/10/2019   -20.00
6  30/10/2019 -5000.00

Upvotes: 0

joseph praful

Reputation: 173

pd.to_numeric(df['Amount']) returns a new, numeric Series; it does not modify the 'Amount' column in place. If you run it in an interactive session (for example IPython or Jupyter) without assigning the result, the converted column only survives in the '_' variable, which holds the last output.

You need to include df['Amount'] = pd.to_numeric(df['Amount']) to replace the actual column in the dataframe.
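As a minimal sketch of the difference, using the amounts from the question (the variable names converted and total are only for illustration), the column stays non-numeric until the result is assigned back:

```python
import pandas as pd

# Amounts copied from the question; they start out as strings.
df = pd.DataFrame(
    {"Amount": ["-0.4", "-15.81", "-21.98", "-5.27", "-1.5", "-20", "-5,000"]}
)
df["Amount"] = df["Amount"].str.replace(",", "", regex=False)

converted = pd.to_numeric(df["Amount"])  # returns a new Series
print(pd.api.types.is_numeric_dtype(df["Amount"]))  # the column itself is unchanged
print(converted.dtype)                              # the returned Series is numeric

df["Amount"] = converted  # assign back so the DataFrame holds numbers
total = df["Amount"].sum()
print(round(total, 2))
```

After the assignment, sum() adds the values instead of concatenating the strings.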

Upvotes: 0

Omrum Cetin

Reputation: 1469

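Use the sum method that pandas provides. The axis argument sets the direction: axis=0 sums down a column, while axis=1 sums across each row, so summing "Amount" needs axis=0.

df['Amount'].sum(axis=0, skipna=True)

skipna=True skips NaN values. Note that this only returns a number once the column has a numeric dtype.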
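To make the axis argument concrete, here is a small sketch on a made-up numeric frame (the columns a and b are hypothetical and not from the question). With axis=0 you get one total per column, and with axis=1 one total per row:

```python
import pandas as pd

# Hypothetical numeric data, used only to show what the axis argument does.
df = pd.DataFrame({"a": [1.0, 2.0, None], "b": [10.0, 20.0, 30.0]})

col_totals = df.sum(axis=0, skipna=True)  # one total per column
row_totals = df.sum(axis=1, skipna=True)  # one total per row

print(col_totals)
print(row_totals)
```

The missing value in column a is skipped in both directions because skipna=True.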

Upvotes: 0

prp

Reputation: 962

You are almost there. You only need to assign the result of pd.to_numeric back to the column:

df['Amount'] = pd.to_numeric(df['Amount'])

Upvotes: 2
