I have a dataset of about 200 countries (rows) for different time periods (columns). The Pandas dataframe of this dataset is as follows:
import pandas as pd
data = {'Country': ['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola'],
        '1958-1962': [0, 0, 0, 0, 0],
        '2008-2012': [0.0, 0.0, 8.425, 0.0, 0.0],
        '2013-2017': [0.0, 0.0, 10.46, 0.0, 0.0]}
df = pd.DataFrame(data)
Country 1958-1962 2008-2012 2013-2017
Afghanistan 0 0.000 0.00
Albania 0 0.000 0.00
Algeria 0 8.425 10.46
Andorra 0 0.000 0.00
Angola 0 0.000 0.00
I am trying to obtain the sum of all the values in each column using the following code:
y_data = []
period_list = list(df)
period_list.remove('Country')
for x in period_list:
    y_data.append(df[x].sum())
TypeError: unsupported operand type(s) for +: 'int' and 'str'
Process finished with exit code 1
For some reason, Pandas is also including the header in the sum process. How do I resolve this?
I tested the sum function on the following dataframe using df.sum(), and it correctly produced the sum of each column: 18, 20, 20, 19.
df = pd.DataFrame({"A":[5, 3, 6, 4],
"B":[11, 2, 4, 3],
"C":[4, 3, 8, 5],
"D":[5, 4, 2, 8]})
The output of print(df.drop("Country", axis=1).dtypes) is as follows:
1958-1962 object
1963-1967 object
1968-1972 object
1973-1977 object
1978-1982 object
1983-1987 object
1988-1992 object
1993-1997 object
1998-2002 object
2003-2007 object
2008-2012 object
2013-2017 object
dtype: object
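The object dtype shown above is the likely source of the error, rather than the header being included in the sum. As a minimal sketch (the mixed int/str values below are hypothetical, chosen only to illustrate the mechanism), summing an object column that holds both numbers and strings raises the same kind of TypeError:

import pandas as pd

```python
import pandas as pd

# Hypothetical data: one period column holds a mix of int and str values,
# so pandas stores it with dtype object instead of a numeric dtype.
df = pd.DataFrame({
    "Country": ["Afghanistan", "Algeria", "Angola"],
    "2008-2012": [0, "8.425", "0.0"],
})

print(df["2008-2012"].dtype)  # object

raised = False
try:
    # Summing adds an int to a str element by element, which Python rejects.
    df["2008-2012"].sum()
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```

Checking dtypes first, as done above, is a quick way to tell whether the columns were read in as numbers or as generic Python objects.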
I used df = df.apply(pd.to_numeric, errors='ignore') to convert the objects into numbers, and that resolved the issue.
Upvotes: 0
Convert the columns you want to sum from object to numeric, then drop the Country column before summing the remaining columns.
Refer to this link for converting from object to numeric.
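A minimal sketch of that approach, assuming the period columns hold numbers stored as strings (the sample values are hypothetical). It uses errors="coerce" rather than errors="ignore", since "ignore" leaves an unparseable column unchanged and is, as far as I know, deprecated in recent pandas releases, whereas "coerce" turns unparseable entries into NaN, which sum() skips by default:

```python
import pandas as pd

# Hypothetical sample: numeric values stored as strings (object dtype).
df = pd.DataFrame({
    "Country": ["Afghanistan", "Albania", "Algeria"],
    "2008-2012": ["0.0", "0.0", "8.425"],
    "2013-2017": ["0.0", "0.0", "10.46"],
})

# Every column except Country is a time period to be summed.
period_cols = df.columns.drop("Country")

# Convert each period column to a numeric dtype; bad entries become NaN.
numeric = df[period_cols].apply(pd.to_numeric, errors="coerce")

# Column-wise sums, in the same order as period_cols.
y_data = numeric.sum().tolist()
print(y_data)
```

This replaces the explicit loop with a single column-wise sum. If silently dropping bad values is not acceptable, it may be worth inspecting numeric.isna().sum() before relying on the totals.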
Upvotes: 1