mrgoldtech

Reputation: 73

Accessing attributes in sklearn pipeline

I'm having trouble accessing attributes of intermediate steps in my sklearn pipeline. Here's my code:

from sklearn.pipeline import make_pipeline, make_union
from sklearn.compose import make_column_transformer
from sklearn.impute import SimpleImputer
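from sklearn.preprocessing import StandardScaler, PowerTransformer, OneHotEncoder

# RatingEncoder and the select_*_features column selectors are my own
# definitions (not shown here).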

categorical_pipeline = make_pipeline(
                    SimpleImputer(strategy='constant', fill_value='None'),
                    OneHotEncoder(sparse=False))

ratings_pipeline = make_pipeline(
                RatingEncoder(), 
                StandardScaler(), 
                PowerTransformer(method='yeo-johnson'))

numeric_pipeline = make_pipeline(
                SimpleImputer(strategy='constant', fill_value=0),
                StandardScaler(),
                PowerTransformer(method='yeo-johnson'))

preprocess = make_pipeline(
    make_union(  
        # Select all categorical features and impute NA values into a unique category
        make_column_transformer(
            (categorical_pipeline, select_categorical_features),
            remainder='drop'
        ),      
        # Select all rating-encoded features and convert them to numerical, apply Scaling+PowerTransform
        make_column_transformer(
            (ratings_pipeline, select_rated_features),
            remainder='drop'
        ),   
        # Select all numeric features and impute, Scale+PowerTransform
        make_column_transformer(
            (numeric_pipeline, select_numeric_features),
            remainder='drop'
        ),     
    )
)

I know how to access intermediate steps of a pipeline. Here, I access the PowerTransformer() of the numeric_pipeline with the following line:

preprocess[0].transformer_list[2][1].transformers[0][1][2]

which returns

PowerTransformer(copy=True, method='yeo-johnson', standardize=True)

which leads me to believe that I've accessed that step correctly. However, I want to pull the .lambdas_ attribute from this PowerTransformer, but when I do so, I get the following:

AttributeError: 'PowerTransformer' object has no attribute 'lambdas_'

What am I doing wrong? I ran fit() on the pipeline and I'm accessing the PowerTransformer() step correctly, so why am I getting an AttributeError?

Upvotes: 2

Views: 1228

Answers (1)

mrgoldtech

Reputation: 73

Okay I solved this myself.

preprocess[0].transformer_list[2][1].transformers[0][1][2].lambdas_

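is incorrect. Specifically, the ColumnTransformer's transformers attribute holds the unfitted estimators exactly as they were passed to the constructor. fit() clones them and stores the fitted copies in transformers_ (note the trailing underscore), so the PowerTransformer reached through transformers was never fitted and has no lambdas_. The transformer_list part of the path can stay as it is. The following code works: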

preprocess.steps[0][1].transformer_list[2][1].transformers_[0][1][2].lambdas_
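
To make the difference concrete, here is a minimal sketch on toy data. The random input array, the single numeric branch, and the variable names are assumptions made for illustration, not the original dataset or pipeline. It compares the estimator reached through transformers with the one reached through transformers_, and also shows name-based access via named_transformers_ and named_steps, which may be easier to read than chained integer indices.

```python
import numpy as np
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline, make_union
from sklearn.preprocessing import PowerTransformer, StandardScaler

# Toy data (assumption): 50 rows, two numeric columns.
rng = np.random.RandomState(0)
X = rng.exponential(size=(50, 2))

numeric_pipeline = make_pipeline(
    StandardScaler(),
    PowerTransformer(method='yeo-johnson'))

# Same nesting as in the question, reduced to a single branch.
preprocess = make_pipeline(
    make_union(
        make_column_transformer(
            (numeric_pipeline, [0, 1]),
            remainder='drop'
        ),
    )
)
preprocess.fit(X)

column_transformer = preprocess[0].transformer_list[0][1]

# transformers: the estimators as passed in (never fitted).
unfitted = column_transformer.transformers[0][1][1]
# transformers_: the fitted clones created during fit().
fitted = column_transformer.transformers_[0][1][1]

has_lambdas_unfitted = hasattr(unfitted, 'lambdas_')
has_lambdas_fitted = hasattr(fitted, 'lambdas_')
lambdas = fitted.lambdas_

# Name-based access to the same fitted step; the names are the
# lowercase class names that make_* helpers generate automatically.
fitted_by_name = (column_transformer
                  .named_transformers_['pipeline']
                  .named_steps['powertransformer'])

print(has_lambdas_unfitted, has_lambdas_fitted)
print(lambdas)
```

As a rule of thumb, sklearn attributes that end in an underscore only exist after fit(), so a missing trailing underscore somewhere in the access path is worth checking first when a fitted attribute appears to be absent.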

Upvotes: 2
