Reputation: 83
I am trying to define custom transformers with parameters and use them in a sklearn.compose.ColumnTransformer. I do not understand why my custom transformers' parameters are not taken into account when I run fit_transform() on the ColumnTransformer.
The script below shows an oversimplified example of the issue I'm facing. The console output of the script is:
TRUE
FALSE
------
FALSE
FALSE
Why are both BlankTransformers initialized with the default value when I call fit_transform()?
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.base import BaseEstimator, TransformerMixin
class BlankTransformer(BaseEstimator, TransformerMixin):
    def __init__(self, test_bool=False):
        if test_bool:
            print("TRUE")
        else:
            print("FALSE")

    def fit(self, X, y=None):
        return self

    def transform(self, X, y=None):
        return X

df = pd.DataFrame(np.array([[1, 2, 3, 4], [4, 5, 6, 7], [7, 8, 9, 10]]),
                  columns=['a', 'b', 'c', 'd'])
column_transformer = ColumnTransformer(
    [('true', BlankTransformer(True), ['a', 'b']),
     ('false', BlankTransformer(False), ['c', 'd'])],
    remainder='passthrough')
print("------")
df = column_transformer.fit_transform(df)
Upvotes: 1
Views: 682
Reputation: 4221
You are missing the assignment of the test_bool boolean to self in the __init__ step. Once you have done that, you will get the expected results from your print statement:
def __init__(self, test_bool=False):
    self.test_bool = test_bool
    if self.test_bool:
        print("TRUE")
    else:
        print("FALSE")
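As for why the assignment matters: ColumnTransformer fits clones of the transformers you pass in, and a clone is rebuilt from get_params(), which looks up an attribute named after each __init__ argument. The snippet below is only a rough sketch of that mechanism in plain Python, not scikit-learn's actual code. simple_clone, Broken and Fixed are hypothetical names made up for illustration, and the fallback to None for a missing attribute is an assumption that mirrors what the output in the question suggests (as far as I know, newer scikit-learn releases raise an AttributeError in that case instead).

```python
import inspect


def simple_clone(estimator):
    """Simplified, hypothetical stand-in for what sklearn.base.clone does."""
    cls = type(estimator)
    # Parameter names are read from the __init__ signature.
    names = [n for n in inspect.signature(cls.__init__).parameters if n != "self"]
    # Values are read back as attributes with the same names. A missing
    # attribute falls back to None here (assumed older behavior).
    params = {n: getattr(estimator, n, None) for n in names}
    # A brand-new instance is built, so __init__ runs a second time.
    return cls(**params), params


class Broken:
    def __init__(self, test_bool=False):
        pass  # test_bool is never stored on self


class Fixed:
    def __init__(self, test_bool=False):
        self.test_bool = test_bool


_, broken_params = simple_clone(Broken(True))
fixed_copy, fixed_params = simple_clone(Fixed(True))

print(broken_params)  # {'test_bool': None}
print(fixed_params)   # {'test_bool': True}
```

In the broken case the copy is constructed with test_bool=None, which is falsy, so the second round of prints shows FALSE for both transformers. Storing every __init__ argument on self under the same name avoids this.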
Upvotes: 1