Reputation: 95
When I try to get the mean of one of my data frame's columns it shows the error:
TypeError: unsupported operand type(s) for +: 'int' and 'str'
Here is the code I have:
import pandas as pd
import numpy as np
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data"
df = pd.read_csv(url, header=None)
headers = ["symboling","normalized-losses","make","fuel-type","aspiration","num-of-doors","body-style","drive-wheels","engine-location","wheel-base","length","width","height","curb-weight","engine-type","num-of-cylinders","engine-size","fuel-system","bore","stroke","compression-ratio","horsepower","peak-rpm","city-mpg","highway-mpg","price"]
df.columns = headers
df.replace('?',np.nan, inplace=True)
mean_val = df['normalized-losses'].mean()
print(mean_val)
Upvotes: 8
Views: 25603
Reputation: 3290
You need to convert the column to a numeric data type with pd.to_numeric(). If you use the option errors='coerce', it will automatically replace non-numeric values with NaN.
mean_val = pd.to_numeric(df['normalized-losses'], errors='coerce').mean()
print(mean_val)
> 122.0
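To see what errors='coerce' does in isolation, here is a minimal sketch on a small made-up Series (the values are hypothetical and only stand in for the real column):

```python
import pandas as pd

# Hypothetical sample: numeric strings mixed with the '?' placeholder
raw = pd.Series(["100", "?", "150", "?", "200"])

# errors='coerce' turns anything that cannot be parsed into NaN
numeric = pd.to_numeric(raw, errors="coerce")

# mean() skips NaN by default, so only 100, 150 and 200 are averaged
mean_val = numeric.mean()
print(numeric.dtype)  # float64
print(mean_val)       # 150.0
```

Because the unparseable entries become NaN, the explicit replace('?', np.nan) step is not strictly needed before calling pd.to_numeric this way.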
Upvotes: 10
Reputation: 1771
Adding onto Nathaniel's answer, you have a mix of float and str values in that column. You can see this if you run
print(df['normalized-losses'].apply(type))
which returns
0 <class 'float'>
1 <class 'float'>
2 <class 'float'>
3 <class 'str'>
4 <class 'str'>
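For a long column it can be easier to count the types than to read them row by row. A rough sketch, using a small hypothetical Series in place of the real column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample mimicking the column after replace('?', np.nan):
# NaN is a float, while the remaining values are still strings
col = pd.Series([np.nan, np.nan, "164", "158"], dtype=object)

# Map each value to the name of its Python type, then count the names
type_counts = col.map(lambda v: type(v).__name__).value_counts()
print(type_counts)
```

On this sample the counts show two float entries (the NaN values) and two str entries, which is the same mix that trips up mean().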
As your error message suggests, every value in the column needs to be a float. You can either use pd.to_numeric as Nathaniel suggested, or alternatively use
df['normalized-losses'] = df['normalized-losses'].astype('float')
mean_val = df['normalized-losses'].mean()
print(mean_val)
Output
122.0
If you are only interested in the normalized-losses column and know that all of its strings can be converted properly (in this case I believe they can, since they are all strings of numbers such as '130'), you can just do this. If you are going to use the rest of the data and want all numeric strings converted, use Nathaniel's implementation.
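The difference between the two approaches shows up when a string cannot be parsed. A small sketch, assuming a hypothetical column that contains the text 'n/a' (not a value from the actual dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical column where one entry is not a number at all
col = pd.Series(["130", np.nan, "n/a"], dtype=object)

try:
    col.astype("float")
    astype_failed = False
except ValueError:
    # astype cannot skip a string it is unable to parse
    astype_failed = True

# to_numeric with errors='coerce' replaces the bad entry with NaN instead
coerced = pd.to_numeric(col, errors="coerce")
print(astype_failed)
print(coerced.tolist())
```

So astype('float') is fine when every string is known to be numeric, while pd.to_numeric(..., errors='coerce') is the more forgiving choice when that is not guaranteed.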
Upvotes: 3