Reputation: 33223
I have a feature set
[x1, x2, ..., xm]
Now I want to create a polynomial feature set. That means that if the degree is two, I get the feature set
[x1, ..., xm, x1^2, x2^2, ..., xm^2, x1x2, x1x3, ..., x1xm, ..., xm-1xm]
So it contains terms up to order 2 only. The same goes for order three, where you would have the cubic terms as well.
How to do this?
Edit 1: I am working on a machine learning project where I have close to 7 features, and a non-linear regression on these linear features gives OK results. So I thought that, to get more features, I could map them to a higher dimension. One way is to take polynomial terms of the feature vector. Generating x1*x1 is easy :) but getting the rest of the combinations is a bit tricky.
Can combinations give me the x1x2x3 term if the order is 3?
Upvotes: 5
Views: 3139
Reputation: 60
Using Karl's answer as inspiration, try using product and then taking advantage of the set object. Something like
set(frozenset(comb) for comb in itertools.product(range(5), range(5)))
(frozenset rather than set, because the inner sets have to be hashable.) This will get rid of recurring pairs. Then you can turn the set back into a list and sort it or iterate over it as you please.
EDIT:
This will actually kill the x_m^2 terms, because a pair like (m, m) collapses to a single element. So build sorted tuples instead of sets; this keeps the terms hashable and non-repeating.
set([tuple(sorted(comb)) for comb in itertools.product(range(5),range(5))])
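As a quick check of the sorted-tuple idea, here is a minimal sketch that compares it against itertools.combinations_with_replacement, which another answer in this thread suggests. The names n, r, deduped and direct are just illustrative, and the indices 0..4 stand in for the features.

```python
from itertools import product, combinations_with_replacement

n, r = 5, 2  # 5 features (as indices 0..4), degree-2 terms

# Sort each ordered pair so (1, 0) and (0, 1) become the same tuple,
# then let the set drop the duplicates. Pairs like (2, 2) survive.
deduped = {tuple(sorted(comb)) for comb in product(range(n), repeat=r)}

# The same index tuples, generated directly without duplicates.
direct = set(combinations_with_replacement(range(n), r))

print(deduped == direct)  # True
print(len(deduped))       # 15
```

Both routes give the same 15 index pairs, but product builds all 25 ordered pairs first, so the direct version scales better as the degree grows.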
Upvotes: 0
Reputation: 86
Use itertools.combinations(list, r), where list is the feature set and r is the order of the desired polynomial features. Then multiply the elements of each tuple it returns. That should give you {x1*x2, x1*x3, ...}. You'll need to construct the other terms, then union all the parts.
[Edit]
Better: itertools.combinations_with_replacement(list, r) will nicely give sorted length-r tuples with repeated elements allowed.
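A minimal sketch of how this could be turned into actual feature values, assuming the features are plain numbers in a list. The function name polynomial_features is made up for illustration, and math.prod needs Python 3.8 or later.

```python
from itertools import combinations_with_replacement
from math import prod  # available from Python 3.8


def polynomial_features(x, degree):
    """Return all monomials of the entries of x with total degree 1..degree."""
    features = []
    for r in range(1, degree + 1):
        # Each index tuple, e.g. (0, 0, 2), stands for x[0] * x[0] * x[2].
        for idx in combinations_with_replacement(range(len(x)), r):
            features.append(prod(x[i] for i in idx))
    return features


print(polynomial_features([2, 3], 2))  # [2, 3, 4, 6, 9]
```

With degree 3 and three features this also yields the x1x2x3 term, since the index tuple (0, 1, 2) is among the length-3 combinations. For 7 features and degree 2 the result has 7 + 28 = 35 entries.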
Upvotes: 5
Reputation: 61509
You could use itertools.product to create all the possible tuples of n values that are chosen from the original set; but keep in mind that this will generate (x2, x1) as well as (x1, x2).
Similarly, itertools.combinations will produce tuples without repetition or re-ordering, but that means you won't get (x1, x1), for example.
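To make the difference concrete, here is a small sketch on three placeholder features. The strings "x1", "x2", "x3" and the variable names are only for illustration.

```python
from itertools import product, combinations, combinations_with_replacement

features = ["x1", "x2", "x3"]

# product: every ordered pair, so both (x1, x2) and (x2, x1) appear.
ordered = list(product(features, repeat=2))

# combinations: no re-ordering and no repeats, so (x1, x1) is missing.
no_repeats = list(combinations(features, 2))

# combinations_with_replacement: no re-ordering, repeats allowed.
with_repeats = list(combinations_with_replacement(features, 2))

print(len(ordered))       # 9
print(no_repeats)         # [('x1', 'x2'), ('x1', 'x3'), ('x2', 'x3')]
print(len(with_repeats))  # 6

# For order 3, combinations does give the x1x2x3 term (and only that one here).
print(list(combinations(features, 3)))  # [('x1', 'x2', 'x3')]
```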
What exactly are you trying to do? What do you need these result values for? Are you sure you do want those x1^2 type terms (what does it mean to have the same feature more than once)? What exactly is a "feature" in this context anyway?
Upvotes: 3