H.Leger

Reputation: 83

Why is ColumnTransformer not taking transformer arguments when running?

I am trying to define custom transformers with parameters and use them in a sklearn.compose.ColumnTransformer. I do not understand why my custom transformers' parameters are not taken into account when I run fit_transform() on the ColumnTransformer.

The script below shows an oversimplified example of the issue I'm facing. The console output of the script is:

TRUE
FALSE
------
FALSE
FALSE

Why are both BlankTransformers initialized with the default value when I call fit_transform()?

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.base import BaseEstimator, TransformerMixin


class BlankTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, test_bool=False):
        if(test_bool):
            print("TRUE")
        else:
            print("FALSE")

    def fit(self, X, y=None):
        return self

    def transform(self, X, y=None):
        return X


df = pd.DataFrame(np.array([[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]]),
                  columns=['a', 'b', 'c', 'd'])

column_transformer = ColumnTransformer(
    [('true', BlankTransformer(True), ['a', 'b']),
     ('false', BlankTransformer(False), ['c', 'd'])],
    remainder='passthrough')

print("------")

df = column_transformer.fit_transform(df)

Upvotes: 1

Views: 682

Answers (1)

MaximeKan

Reputation: 4221

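You are missing the assignment of test_bool to self in __init__. ColumnTransformer clones each transformer when it is fitted, and the clone is rebuilt from get_params(), which reads every constructor argument back from the attribute of the same name. Without self.test_bool, the value you passed is lost when the transformer is cloned, which is why both copies print FALSE. Once you store it, you will get the expected output from your print statements: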

def __init__(self, test_bool=False):
    self.test_bool = test_bool
    if self.test_bool:
        print("TRUE")
    else:
        print("FALSE")
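
To see why the assignment matters, here is a minimal sketch of the cloning step. It assumes a recent scikit-learn and, to keep the example short, uses a plain NumPy array with integer column indices instead of the DataFrame from the question; the variable names (original, copy, ct, fitted) are only illustrative. The print calls are left out of __init__ so the constructor does nothing but store its parameter.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.compose import ColumnTransformer


class BlankTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, test_bool=False):
        # Store the argument under the same name so get_params() can read it back.
        self.test_bool = test_bool

    def fit(self, X, y=None):
        return self

    def transform(self, X, y=None):
        return X


original = BlankTransformer(test_bool=True)

# clone() builds a new, unfitted instance from original.get_params().
copy = clone(original)
print(original.get_params())  # {'test_bool': True}
print(copy is original)       # False
print(copy.test_bool)         # True

# ColumnTransformer performs the same cloning inside fit / fit_transform.
X = np.array([[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]])
ct = ColumnTransformer(
    [('true', BlankTransformer(True), [0, 1]),
     ('false', BlankTransformer(False), [2, 3])],
    remainder='passthrough')
ct.fit_transform(X)

# The fitted copy is a different object, but it kept the parameter.
fitted = ct.named_transformers_['true']
print(fitted is ct.transformers[0][1])  # False
print(fitted.test_bool)                 # True
```

Because the fitted transformers are clones, anything done in __init__ runs again for each copy, so it is usually safer to keep __init__ limited to storing parameters and to put any other logic in fit.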

Upvotes: 1
