Reputation: 335
When I try to use SimpleImputer to fill in missing values, I get this error: TypeError: unhashable type: 'slice'.
The code I used is pasted below. Please help.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# for calculating mean
from sklearn.impute import SimpleImputer
# Read data from the csv file
dataset = pd.read_csv('Data.csv')
# independent matrix (features)
independant_matrix = dataset.iloc[:, :-1] # [rows, columns]: all the rows, all the columns except the last one.
# dependent vector: just the last column, but all rows.
depandent_matrix = dataset.iloc[:, 3] # all the rows and the 4th column; the index starts at 0.
# replace missing data with the mean of the column
imputer = SimpleImputer(missing_values = 'NaN', strategy = 'mean')
# upper bound is excluded :(
imputer.fit(independant_matrix[:, 1:3])
independant_matrix[:, 1:3] = imputer.transform(independant_matrix[:, 1:3])
Upvotes: 0
Views: 807
Reputation: 67
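independant_matrix is a DataFrame, so it can't be indexed with NumPy-style slicing like independant_matrix[:, 1:3]. Plain square brackets on a DataFrame look up columns by label, so pandas treats the pair of slices as a single key, which is what triggers the error. For position-based slicing you must use the iloc indexer (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iloc.html#pandas.DataFrame.iloc).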
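# Change both calls to go through iloc, like this:
imputer.fit(independant_matrix.iloc[:, 1:3])
independant_matrix.iloc[:, 1:3] = imputer.transform(independant_matrix.iloc[:, 1:3])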
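For reference, here is a minimal end-to-end sketch of the fix. Since Data.csv isn't shown, the small DataFrame below is a made-up stand-in (the column names and values are assumptions, not your data). It also passes np.nan rather than the string 'NaN' as missing_values, since read_csv loads empty cells as np.nan; as far as I know the string form will raise a separate error on numeric columns once the slicing problem is out of the way.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for Data.csv: one text column, two numeric
# columns with gaps, and a label column (all values are made up).
dataset = pd.DataFrame({
    "Country": ["France", "Spain", "Germany", "Spain"],
    "Age": [44.0, 27.0, np.nan, 38.0],
    "Salary": [72000.0, np.nan, 54000.0, 61000.0],
    "Purchased": ["No", "Yes", "No", "Yes"],
})

# .copy() so that writing the imputed values back does not touch `dataset`
independant_matrix = dataset.iloc[:, :-1].copy()
depandent_matrix = dataset.iloc[:, -1]

# np.nan (not the string 'NaN') marks the missing entries
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")

# Positional slicing on a DataFrame goes through .iloc; 1:3 selects columns 1 and 2
independant_matrix.iloc[:, 1:3] = imputer.fit_transform(independant_matrix.iloc[:, 1:3])

print(independant_matrix)
```

After this runs, each missing Age and Salary entry holds the mean of the remaining values in its column. The [:, 1:3] syntax from the question only works directly on NumPy arrays, not on DataFrames.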
Upvotes: 1