Reputation: 3086
I am learning Pipelines and FeatureUnions in scikit-learn and am wondering whether it is possible to repeatedly apply 'make_union' to the same class.
Consider the following code:
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.linear_model import LogisticRegression
import sklearn.datasets as d
class IrisDataManupulation(BaseEstimator, TransformerMixin):
    """
    Raise every entry of the feature matrix to a given power
    """
    def __init__(self, power=2):
        self.power = power

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.power(X, self.power)
iris_data = d.load_iris()
X, y = iris_data.data, iris_data.target
# feature union:
fu = FeatureUnion(transformer_list=[('squared', IrisDataManupulation(power=2)),
                                    ('third', IrisDataManupulation(power=3))])
QUESTION: Is there a neat way to create the FeatureUnion without repeating the same transformer, for instance by passing a list of parameters instead?
For example:
# Hypothetical syntax: FeatureUnion does not accept a param_grid argument
fu_new = FeatureUnion(transformer_list=[('raise_power', IrisDataManupulation())],
                      param_grid={'raise_power__power': [2, 3]})
Upvotes: 1
Views: 169
Reputation: 36619
You can move all of the power logic into a single custom transformer. Change your IrisDataManupulation so that it handles the list of powers itself:
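class IrisDataManupulation(BaseEstimator, TransformerMixin):
    def __init__(self, powers=(2,)):
        # A tuple default avoids sharing one mutable list between instances
        self.powers = powers

    def fit(self, X, y=None):
        # Stateless transformer: nothing to learn from the data
        return self

    def transform(self, X):
        # Raise X to each power and stack the results column-wise
        powered_arrays = []
        for power in self.powers:
            powered_arrays.append(np.power(X, power))
        return np.hstack(powered_arrays)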
Then you can just use this new transformer instead of FeatureUnion:
fu = IrisDataManupulation(powers=[2,3])
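As a quick sanity check, here is a minimal sketch on a toy array rather than the iris data. The class name MultiPower and the array X_toy are made up for this illustration; the class simply mirrors the transformer described above. It also shows that, if you do want to keep a FeatureUnion, you can build its transformer_list from a list of powers with a comprehension instead of writing each entry by hand.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import FeatureUnion


class MultiPower(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for the multi-power transformer above."""

    def __init__(self, powers=(2,)):
        self.powers = powers

    def fit(self, X, y=None):
        # Nothing to learn; present so fit_transform works
        return self

    def transform(self, X):
        # One block of columns per power, stacked side by side
        return np.hstack([np.power(X, p) for p in self.powers])


X_toy = np.array([[1.0, 2.0], [3.0, 4.0]])

# Single transformer handling both powers
combined = MultiPower(powers=(2, 3)).fit_transform(X_toy)

# Equivalent FeatureUnion, generated from the same list of powers
union = FeatureUnion(
    [(f"power_{p}", MultiPower(powers=(p,))) for p in (2, 3)]
)
from_union = union.fit_transform(X_toy)

print(combined.shape)                       # 2 rows, 2 features x 2 powers
print(np.array_equal(combined, from_union))
```

Both routes should produce the same matrix: the squared columns first, then the cubed columns.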
Note: If you want to generate polynomial features from your original features, I would recommend looking at PolynomialFeatures (in sklearn.preprocessing), which can generate the powers you want as well as interaction terms between features.
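For comparison, a small sketch of PolynomialFeatures on a made-up two-column array (X_toy is only for illustration). Note that, unlike the custom transformer, it also keeps the original degree-1 columns and adds the cross term, so it is not a drop-in replacement if you want pure powers only.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X_toy = np.array([[1.0, 2.0], [3.0, 4.0]])

# include_bias=False drops the constant column of ones
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X_toy)

# Column order for two features a, b: a, b, a^2, a*b, b^2
print(X_poly)
```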
Upvotes: 2