Gabriel Zangerme

Reputation: 95

unsupported operand type(s) for +: 'int' and 'str' using Pandas mean

When I try to compute the mean of one of my DataFrame's columns, I get this error:

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Here is the code I have:

import pandas as pd

import numpy as np

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data"

df = pd.read_csv(url, header=None)

headers = ["symboling","normalized-losses","make","fuel-type","aspiration","num-of-doors","body-style","drive-wheels","engine-location","wheel-base","lenght","width","height","curb-weight","engine-type","num-of-cylinders","engine-size","fuel-system","bore","stroke","compression-ratio","horsepower","peak-rpm","city-mpg","highway-mpg","price"]

df.columns = headers

df.replace('?',np.nan, inplace=True)

mean_val = df['normalized-losses'].mean()

print(mean_val)

Upvotes: 8

Views: 25603

Answers (2)

Nathaniel

Reputation: 3290

You need to convert the column to a numeric data type with pd.to_numeric(). If you pass errors='coerce', any value that cannot be parsed as a number is replaced with NaN instead of raising an error.

mean_val = pd.to_numeric(df['normalized-losses'], errors='coerce').mean()

print(mean_val)

> 122.0
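For anyone who wants to try this without downloading the file, here is a minimal sketch on made-up data. The Series `losses` is a hypothetical stand-in for df['normalized-losses'] as it comes out of read_csv (all strings, with '?' marking missing entries); the values are not taken from the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for df['normalized-losses'] (made-up values):
# every entry is a string, and '?' marks a missing value.
losses = pd.Series(["?", "164", "164", "?", "158"])

# errors='coerce' turns anything that cannot be parsed as a number into NaN,
# so the separate replace('?', np.nan) step is not strictly needed here.
numeric = pd.to_numeric(losses, errors="coerce")

print(numeric.dtype)         # float64
print(numeric.isna().sum())  # 2, the two '?' entries
print(numeric.mean())        # 162.0, NaN values are skipped by default
```

Note that this does not modify the original column; assign the result back if you want the DataFrame itself to hold numbers.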

Upvotes: 10

Jack Moody

Reputation: 1771

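Adding to Nathaniel's answer: the column holds a mix of float and str values. The NaN values that replaced '?' are floats, while the actual numbers are still strings. You can see this if you run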

print(df['normalized-losses'].apply(type))

Which will return

0      <class 'float'>
1      <class 'float'>
2      <class 'float'>
3        <class 'str'>
4        <class 'str'>
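To see where the wording of the error comes from, here is a plain-Python sketch (no pandas involved, made-up values). It is only an analogy for what happens when the strings in the column get summed, not a description of pandas internals.

```python
# Made-up string values, like the non-missing entries in the column.
values = ["164", "164", "158"]

try:
    # The built-in sum() starts from the int 0, so the first step is 0 + "164".
    total = sum(values)
except TypeError as exc:
    message = str(exc)
    print(message)  # unsupported operand type(s) for +: 'int' and 'str'
```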

As your error message indicates, the mean cannot be computed while some values are strings, so you need to convert the whole column to float. You can either use pd.to_numeric as Nathaniel suggested, or alternatively use

df['normalized-losses'] = df['normalized-losses'].astype('float')
mean_val = df['normalized-losses'].mean()
print(mean_val)

Output

122.0

If you are only interested in the normalized-losses column and know that every string in it converts cleanly (in this case I believe they do, since they are all strings of numbers such as '130'), astype is enough. If you are going to use the rest of the data and want all numeric strings converted, use Nathaniel's approach.
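A possible way to apply the pd.to_numeric approach to several columns at once is sketched below. The small DataFrame and the `numeric_cols` list are made up for illustration; with the real data you would list whichever columns you expect to be numeric.

```python
import pandas as pd

# Hypothetical miniature of the autos data (made-up values).
df = pd.DataFrame({
    "normalized-losses": ["?", "164", "158"],
    "horsepower": ["111", "?", "102"],
    "make": ["audi", "bmw", "audi"],
})

# Assumption: these are the columns that should end up numeric.
numeric_cols = ["normalized-losses", "horsepower"]

# DataFrame.apply calls pd.to_numeric on each selected column and
# forwards errors='coerce', so '?' becomes NaN in every one of them.
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")

print(df[numeric_cols].mean())
```

Text columns such as make are left out of `numeric_cols` on purpose, since coercing them would turn every value into NaN.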

Upvotes: 3
