Furkan Karacan

Reputation: 192

Python - How to get a column's mean if there is String value too

I am new to Python. I have a .csv dataset with a column called BasePay.

Most of the values in the column are of type int, but some values are "Not Provided".

I am trying to get the mean of BasePay with:

sal['BasePay'].mean()

But it raises this error:

TypeError: can only concatenate str (not "int") to str.

I want to omit those string values. How can I do that?

Thanks.

Upvotes: 1

Views: 2927

Answers (3)

Ole Kristian

Reputation: 61

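If you store the data from the BasePay column in a list l, you can do as follows:

x = []  # collects only the integer entries
for i in l:
    # skip non-integer entries such as "Not Provided"
    if isinstance(i, int):
        x.append(i)

mean = sum(x) / len(x)
print(mean)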
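A slightly more defensive sketch of the same loop idea, in case the values arrive as numeric strings (which often happens when a CSV column mixes numbers and text). The function name mean_of_numbers and the sample list are made up for illustration; this is one possible variant, not the answer's exact method.

```python
import math


def mean_of_numbers(values):
    """Average the entries that can be read as numbers; skip everything else."""
    numbers = []
    for value in values:
        try:
            number = float(value)  # accepts ints, floats and strings like "2"
        except (TypeError, ValueError):
            continue  # e.g. "Not Provided" or None
        if not math.isnan(number):
            numbers.append(number)
    # Return NaN instead of dividing by zero when nothing numeric was found.
    return sum(numbers) / len(numbers) if numbers else float("nan")


# Hypothetical sample standing in for the BasePay column.
base_pay = [1, "Not Provided", "2", 3.0, "Not Provided"]
print(mean_of_numbers(base_pay))
```

For the sample list this averages 1, 2 and 3 and ignores both "Not Provided" entries.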

Upvotes: 1

jezrael

Reputation: 862601

Because the column contains some non-numeric values, use to_numeric with errors='coerce' to convert them to NaN. mean then works, because it skips NaN by default:

out = pd.to_numeric(sal['BasePay'], errors='coerce').mean()

Sample:

import pandas as pd

sal = pd.DataFrame({'BasePay':[1, 'Not Provided', 2, 3, 'Not Provided']})
print (sal)
        BasePay
0             1
1  Not Provided
2             2
3             3
4  Not Provided

print (pd.to_numeric(sal['BasePay'], errors='coerce'))
0    1.0
1    NaN
2    2.0
3    3.0
4    NaN
Name: BasePay, dtype: float64

out = pd.to_numeric(sal['BasePay'], errors='coerce').mean()
print (out)
2.0
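An alternative that may be worth trying is to mark the placeholder as missing while the file is read, using the na_values parameter of pd.read_csv. The sketch below is not part of the answer above; the CSV text, the Name column and the variable sal_demo are invented for illustration, and io.StringIO stands in for the real file path.

```python
import io

import pandas as pd

# Made-up CSV text standing in for the real .csv file.
csv_text = "Name,BasePay\nA,1\nB,Not Provided\nC,2\nD,3\nE,Not Provided\n"

# Treat the placeholder string as missing at load time, so the column
# is parsed as numeric instead of as strings.
sal_demo = pd.read_csv(io.StringIO(csv_text), na_values=["Not Provided"])

print(sal_demo["BasePay"].dtype)
print(sal_demo["BasePay"].mean())
```

With this sample the column comes out as float64 and the mean is taken over the three numeric rows only. Unlike errors='coerce', this only catches the exact placeholder strings listed in na_values.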

Upvotes: 4

user11230797

Reputation: 11

This problem occurs because, when you import the dataset, pandas fills the empty fields with NaN. So you have two options: either replace the NaN values with 0 (fillna(0)) or remove them (dropna()).

The same result can also be achieved with np.nanmean(), which ignores NaN values, provided the column has a numeric dtype.
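The options mentioned here do not all give the same number, which a small sketch can show. It assumes the column is first converted with pd.to_numeric(..., errors='coerce'), which this answer does not state; the variable names and sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the strings become NaN after coercion.
raw = pd.Series([1, "Not Provided", 2, 3, "Not Provided"])
pay = pd.to_numeric(raw, errors="coerce")

mean_dropna = pay.dropna().mean()          # NaN rows removed: (1 + 2 + 3) / 3
mean_fillna = pay.fillna(0).mean()         # NaN counted as 0: (1 + 0 + 2 + 3 + 0) / 5
mean_nanmean = np.nanmean(pay.to_numpy())  # NaN ignored, same as dropna

print(mean_dropna, mean_fillna, mean_nanmean)
```

Replacing NaN with 0 pulls the mean down because the missing rows still count in the denominator, so dropping or ignoring them is usually closer to what the question asks for.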

Upvotes: 1
