makt

Reputation: 107

Error when using OneHotEncoder and SimpleImputer in a single pipeline

How can I do this in a single pipeline? Is a value missing, or have I defined something incorrectly?

 #instantiate
imputer = SimpleImputer()
ohe = OneHotEncoder(use_cat_names=True)

#fit
imputer.fit(X_train)
ohe.fit(X_train)

#transform
XT_train = imputer.transform(X_train["lat","lon"])
XT_train = ohe.transform(X_train["neighborhood"])



model = make_pipeline(
    SimpleImputer(),
    OneHotEncoder(use_cat_names=True),
    Ridge()
)
model.fit(X_train, y_train)


Fitting this pipeline raises an error in the console.

Upvotes: 0

Views: 323

Answers (2)

user15316630

Reputation: 33

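Yes, a lot happens under the hood, but OneHotEncoder has to come before SimpleImputer. If you start the pipeline with SimpleImputer, you get the error you saw; I ran into it too. The likely cause is that SimpleImputer() defaults to strategy="mean", which cannot be computed while the string column neighborhood is still in the data. Changing the order solved the issue. Here is the resulting pipeline: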

Pipeline
    OneHotEncoder(cols=['neighborhood'], use_cat_names=True)
    SimpleImputer()
    Ridge()
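For what it's worth, the failure with the imputer first can be reproduced on a toy frame. This is only a minimal sketch: the column names come from the question, but `X_toy` and its values are invented, and since the original error message was not posted, the `ValueError` shown here is the likely cause rather than a confirmed one.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Toy stand-in for X_train; the values are invented for illustration.
X_toy = pd.DataFrame(
    {
        "lat": [-34.60, np.nan, -34.58],
        "lon": [-58.38, -58.40, np.nan],
        "neighborhood": ["Palermo", "Recoleta", "Palermo"],
    }
)

# SimpleImputer() defaults to strategy="mean", which needs numeric input,
# so fitting it on a frame that still holds strings raises a ValueError.
try:
    SimpleImputer().fit(X_toy)
    imputer_failed = False
except ValueError as exc:
    imputer_failed = True
    print(f"SimpleImputer on the raw frame fails: {exc}")

# Restricted to the numeric columns, the same imputer works and
# replaces each NaN with the mean of its column.
filled = SimpleImputer().fit_transform(X_toy[["lat", "lon"]])
print(filled)
```

Once the string column has been encoded to numbers, every column the imputer sees is numeric, which is why placing the encoder first avoids the problem.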

Upvotes: 0

makt

Reputation: 107

#instantiate
imputer = SimpleImputer()
ohe = OneHotEncoder(use_cat_names=True)

#fit
imputer.fit(X_train)
ohe.fit(X_train)

#transform
XT_train = imputer.transform(X_train["lat","lon"])
XT_train = ohe.transform(X_train["neighborhood"])

Remove all of the lines above. OneHotEncoder (the one from category_encoders, which is where use_cat_names comes from) automatically detects the categorical columns in the feature matrix. The same goes for SimpleImputer: it finds the numerical NaN values and fills them in.

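# Imports
from category_encoders import OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Build model: encode first, so the imputer only sees numeric columns
model = make_pipeline(
    OneHotEncoder(use_cat_names=True),
    SimpleImputer(),
    Ridge()
)
# Fit model
model.fit(X_train, y_train)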
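If you would rather not depend on step order, the per-column routing attempted in the question (impute lat and lon, encode neighborhood) can also be written with scikit-learn's `ColumnTransformer`. Note that this is a different technique from the pipeline above: it uses `sklearn.preprocessing.OneHotEncoder` instead of the category_encoders one, so `use_cat_names` does not apply. The sketch below runs on invented toy data; `X_toy`, `y_toy`, `preprocess`, and `model_ct` are illustrative names, not from the question.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder  # scikit-learn's encoder

# Toy stand-ins for X_train / y_train; the values are invented.
X_toy = pd.DataFrame(
    {
        "lat": [-34.60, np.nan, -34.58, -34.61],
        "lon": [-58.38, -58.40, np.nan, -58.42],
        "neighborhood": ["Palermo", "Recoleta", "Palermo", "Recoleta"],
    }
)
y_toy = pd.Series([120000.0, 150000.0, 110000.0, 160000.0])

# Route each transformer to the columns it should handle.
preprocess = ColumnTransformer(
    transformers=[
        ("num", SimpleImputer(), ["lat", "lon"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["neighborhood"]),
    ]
)

# The imputer never sees the string column, so order is no longer an issue.
model_ct = make_pipeline(preprocess, Ridge())
model_ct.fit(X_toy, y_toy)
preds = model_ct.predict(X_toy)
print(preds)
```

The trade-off is that the column lists have to be spelled out, whereas the category_encoders pipeline relies on automatic detection of the categorical columns.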

Upvotes: 1
