Reputation: 111
I have a scikit-learn pipeline with two steps: a ColumnTransformer preprocessor that applies a OneHotEncoder, and a RandomForestRegressor estimator. I would like to get the feature names of the encoded columns after one-hot encoding. My pipeline looks like this:
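from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

categorical_preprocessor = OneHotEncoder(handle_unknown="ignore")
# Model processor
preprocessor = ColumnTransformer(
    [('categorical', categorical_preprocessor, categorical_columns)],
    remainder="passthrough")
est = RandomForestRegressor(n_estimators=100, random_state=0)
pipe = make_pipeline(preprocessor, est)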
I am trying to get the feature names of the encoded columns like this:
pipe['preprocessor'].transformers[0][0].get_feature_names(categorical_columns)
But I get this error:
'str' object has no attribute 'get_feature_names'
Upvotes: 0
Views: 44
Reputation: 111
Apparently, since scikit-learn 1.0 the feature names can be extracted from a fitted pipeline by slicing off the final estimator and calling get_feature_names_out() on the remaining steps:
pipeline[:-1].get_feature_names_out()
Upvotes: 0