Reputation: 343
While applying a polynomial transformation to my set of features, I was reading about the sklearn.preprocessing.PolynomialFeatures transformer, and I realized that the transformation includes all possible combinations, even with the interaction_only=True parameter. I was wondering whether there is a way to specify that only some interactions (combinations) are needed. For instance, given the following dataframe:
   a  b  c  Z  X  W
0  1  2  3  0  1  0
1  1  2  3  1  0  1
2  1  2  3  0  0  1
Let's say that a, b, c belong to one type of feature and Z, X, W to a different one, and we are only interested in interactions between features of different types.
Of course, setting interaction_only=True keeps only the "real" interactions and avoids features like a^2, Z^2 and so on. The desired output, though, would contain just the original features and the interactions between features of different types:
   a  b  c  Z  X  W  a*Z  a*X  a*W  b*Z  b*X  b*W  c*Z  c*X  c*W
0  1  2  3  0  1  0    0    1    0    0    2    0    0    3    0
1  1  2  3  1  0  1    1    0    1    2    0    2    3    0    3
2  1  2  3  0  0  1    0    0    1    0    0    2    0    0    3
I would like to compute interactions only between columns a, b, c and columns Z, X, W, and avoid combinations such as a*c or Z*X.
Upvotes: 1
Views: 953
Reputation: 666
There does not seem to be any way to obtain the transformation you describe with the transformers provided by scikit-learn, but you can build your own transformer to do it.
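A minimal sketch of such a transformer, assuming the input is a pandas DataFrame and that the two feature groups are passed as lists of column names. The class name CrossGroupInteractions and its parameters group_a and group_b are hypothetical names chosen for this example, not part of scikit-learn:

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class CrossGroupInteractions(BaseEstimator, TransformerMixin):
    """Append products of every column in group_a with every column in group_b.

    Hypothetical helper for illustration; it is not a scikit-learn class.
    """

    def __init__(self, group_a, group_b):
        self.group_a = group_a
        self.group_b = group_b

    def fit(self, X, y=None):
        # Nothing is learned from the data, so fit only returns self.
        return self

    def transform(self, X):
        # Work on a copy so the caller's DataFrame is left unchanged.
        out = X.copy()
        for left in self.group_a:
            for right in self.group_b:
                out[f"{left}*{right}"] = X[left] * X[right]
        return out


# The dataframe from the question.
df = pd.DataFrame(
    {
        "a": [1, 1, 1], "b": [2, 2, 2], "c": [3, 3, 3],
        "Z": [0, 1, 0], "X": [1, 0, 0], "W": [0, 1, 1],
    }
)

transformer = CrossGroupInteractions(["a", "b", "c"], ["Z", "X", "W"])
result = transformer.fit_transform(df)
print(result)
```

Because it inherits from BaseEstimator and TransformerMixin, this class should also be usable as a step in a Pipeline, as long as the steps before it pass a DataFrame through rather than a NumPy array.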
Upvotes: 2