Scikit-learn: error in replacing missing data

Question

I am trying to preprocess my data by replacing the missing values with the column mean.

My code is as follows:

#Load the Data 
import numpy as np
data_2 = np.genfromtxt('data.csv', delimiter=',', skip_header=1)

#the missing values in my dataset are identified by value = 0 
#I'm trying to replace the missing values in the third column 
from sklearn.preprocessing import Imputer 
imp = Imputer(missing_values=0, strategy='mean', axis=0)
imp.fit(data_2[:, 2])

It runs, but gives these warnings:

/Users/user1/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)

/Users/user1/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)

But my main problem is that it did not fill in the missing data: I printed the data before and after fitting, and nothing changed.

What am I doing wrong?

Update: Here are a few lines of my dataset:
6,148,72,35,0,33.6,0.627,50,1
1,85,66,29,0,26.6,0.351,31,0
8,183,64,0,0,23.3,0.672,32,1
1,89,66,23,94,28.1,0.167,21,0
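
A likely explanation, offered as a sketch rather than a confirmed diagnosis: fit() only learns the column mean and never modifies the array passed to it, so the filled values have to come from transform() (or fit_transform()) and be assigned back. The warning also suggests passing the column as a 2-D array. The example below assumes scikit-learn 0.20 or later, where SimpleImputer in sklearn.impute replaces the old Imputer class. It uses the four sample rows above instead of data.csv, and it uses column index 3 purely for illustration, because that is where a zero appears in these rows.

```python
import numpy as np
from sklearn.impute import SimpleImputer  # assumes scikit-learn >= 0.20

# The four sample rows from the question, standing in for data.csv.
data_2 = np.array([
    [6, 148, 72, 35, 0, 33.6, 0.627, 50, 1],
    [1, 85, 66, 29, 0, 26.6, 0.351, 31, 0],
    [8, 183, 64, 0, 0, 23.3, 0.672, 32, 1],
    [1, 89, 66, 23, 94, 28.1, 0.167, 21, 0],
])

col = 3  # illustrative choice: this column has a zero in the sample rows

# Treat 0 as the missing-value marker and replace it with the column mean.
imp = SimpleImputer(missing_values=0, strategy='mean')

# Indexing with a list ([col]) keeps the slice 2-D, shape (n_samples, 1),
# which avoids the "Passing 1d arrays" warning.
filled = imp.fit_transform(data_2[:, [col]])

# fit_transform returns a new array, so write the result back explicitly.
data_2[:, col] = filled[:, 0]

print(data_2[:, col])  # [35. 29. 29. 23.]
```

Here the zero is replaced by 29.0, the mean of the non-zero entries 35, 29 and 23. With the older Imputer class, the same two points should apply: pass a 2-D column and use the array returned by the transform step.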

