Borut Flis

Reputation: 16375

How to get the names of the new columns after applying sklearn ColumnTransformer

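from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Scale the numeric columns and one-hot encode the categorical ones.
preprocessor = ColumnTransformer(
    [
        ('num', StandardScaler(), numeric_features),
        ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_features)
    ]
)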

I want to transform both some numeric features and some categorical features.

Running test = preprocessor.fit_transform(X_train) returns a NumPy array, which has no column names.

According to the documentation, ColumnTransformer should have a get_feature_names() method that returns the names of the new features. However, when I run it I get:

AttributeError: Transformer num (type StandardScaler) does not provide get_feature_names.

I want to get the names of the columns dynamically because I don't know the number of categories in advance.

Upvotes: 0

Views: 300

Answers (1)

Celius Stingher

Reputation: 18367

ColumnTransformer takes the columns in the same order as they are defined in your dataframe, so you may consider obtaining them with pandas select_dtypes. Supposing your data is contained in a dataframe df:

numeric_columns = list(df.select_dtypes('number'))
categorical_columns = list(df.select_dtypes('object')) + list(df.select_dtypes('category'))

Upvotes: 1
