Obiii

Reputation: 834

Getting feature names and coefficients from lasso regression in sklearn pipeline

I have a pipeline that uses custom transformers as well.

Here is what the pipeline looks like:

feature_cleaner = Pipeline(steps=[
    ("id_col_remover", columnDropperTransformer(id_cols)),
    ("missing_remover", columnDropperTransformer(miss_cols))
])

zero_Setter = Pipeline(steps=[
    ("zero_imp", ZeroImputer(fill_zero_cols))
])

numeric_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy="constant", fill_value=-1, add_indicator=True)),
    ('scaler', StandardScaler()),
    ("variance_selector", VarianceThreshold(threshold=0.03))
])

categotical_binary_transformer = Pipeline(steps=[
    ('imputer', SimpleImputer(strategy="constant", fill_value=-1, add_indicator=True)),
    ('encode', OneHotEncoder(handle_unknown='ignore'))
])

preprocess_ppl = ColumnTransformer(
    transformers=[
        ('numeric', numeric_transformer, make_column_selector(dtype_include=np.number)),
        ('categorical_binary', categotical_binary_transformer, cat_features)
    ], remainder='drop'
)

steps = [
    ('zero_imputer', zero_Setter),
    ('cleaner', feature_cleaner),
    ("preprocessor", preprocess_ppl),
    ("estimator", linear_model.Lasso())
]

pipeline = Pipeline(
    steps=steps
)

I train the pipeline and save it with joblib. In another notebook, I load the pipeline from the joblib file.

How do I get the feature names and their corresponding coefficients from the fitted lasso, i.e. the coefficient of each feature the lasso used?

Upvotes: 0

Views: 1128

Answers (1)

Ben Reiniger

Reputation: 12698

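If you're using a recent scikit-learn (get_feature_names_out was introduced in version 1.0) and have implemented get_feature_names_out for your custom transformers, then

zip(
    pipeline[:-1].get_feature_names_out(),
    pipeline[-1].coef_
)

should work. The slice pipeline[:-1] is needed because the final Lasso step has no get_feature_names_out of its own, so calling the method on the whole pipeline raises an AttributeError.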

Upvotes: 0
