Reputation:
I am using sklearn's PolynomialFeatures to expand my X values into polynomial features.
Before the transformation, my X values are:
[[ 1 11]
[ 2 12]
[ 3 13]
[ 4 14]
[ 5 15]
[ 6 16]
[ 7 17]
[ 8 18]
[ 9 19]
[10 20]]
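Note: X has more than one column, which means there is more than one independent variable. The code I use for the transformation is:
from sklearn.preprocessing import PolynomialFeatures

poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)
print(X_poly)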
sklearn returns this matrix, which has more columns than just the squared values:
[[ 1. 1. 11. 1. 11. 121.]
[ 1. 2. 12. 4. 24. 144.]
[ 1. 3. 13. 9. 39. 169.]
[ 1. 4. 14. 16. 56. 196.]
[ 1. 5. 15. 25. 75. 225.]
[ 1. 6. 16. 36. 96. 256.]
[ 1. 7. 17. 49. 119. 289.]
[ 1. 8. 18. 64. 144. 324.]
[ 1. 9. 19. 81. 171. 361.]
[ 1. 10. 20. 100. 200. 400.]]
I found this Stack Overflow answer https://stackoverflow.com/a/51906400/12188405 when I searched the web for my issue.
Can anyone please tell me a general formula or Python code that returns that matrix for any degree value? In simple words, I want to write a Python program that takes one parameter, the degree (any non-negative integer), and returns the same matrix that sklearn gives.
Upvotes: 0
Views: 1652
Reputation: 21
This piqued my interest while I was working on a similar problem. To expand on @Reza Soltani's response, PolynomialFeatures(d)
uses itertools.combinations_with_replacement
or a similar combinations function to loop through the degrees from 1 to "d":
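import math  # math.prod needs Python 3.8+
import itertools

X = [2, 3, 4]
degree = 3
res = []
for i in range(1, degree + 1):
    # every way to pick i features, repeats allowed, order ignored
    C = list(itertools.combinations_with_replacement(X, i))
    for j in range(len(C)):
        # multiply the picked features together to get one new feature
        res.append(math.prod(C[j]))
print(res)
print(len(res))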
# [2, 3, 4, 4, 6, 8, 9, 12, 16, 8, 12, 16, 18, 24, 32, 27, 36, 48, 64]
# 19
# degree = 1, 3 elems: [2, 3, 4]
# degree = 2, 6 elems: [4, 6, 8, 9, 12, 16]
# degree = 3, 10 elems: [8, 12, 16, 18, 24, 32, 27, 36, 48, 64]
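Building on the loop above, here is a minimal sketch of a function that takes a whole data matrix and a degree and returns the expanded matrix. The name polynomial_features and its include_bias argument are hypothetical (this is plain Python, not sklearn's implementation), and the column order (constant column first, then degree 1, degree 2, and so on) is assumed to match what the question printed for the default settings.

```python
import itertools
import math


def polynomial_features(X, degree, include_bias=True):
    """Expand each row of X into all monomials of total degree <= degree.

    Hypothetical helper, not part of sklearn. Columns are ordered by degree,
    and within one degree in combinations_with_replacement order.
    """
    n_features = len(X[0])
    start = 0 if include_bias else 1
    # Each combo is a tuple of column indices, e.g. (0, 1) means x0 * x1.
    # The empty tuple (degree 0) produces the constant column of ones.
    combos = [
        combo
        for d in range(start, degree + 1)
        for combo in itertools.combinations_with_replacement(range(n_features), d)
    ]
    return [[math.prod(row[i] for i in combo) for combo in combos] for row in X]


X = [[1, 11], [2, 12], [3, 13]]
for row in polynomial_features(X, degree=2):
    print(row)
# [1, 1, 11, 1, 11, 121]
# [1, 2, 12, 4, 24, 144]
# [1, 3, 13, 9, 39, 169]
```

As for a general formula: with n input columns and degree d, the number of output columns (including the constant column) is the binomial coefficient C(n + d, d), which math.comb(n + d, d) computes. For n = 2 and d = 2 that gives 6, matching the matrix in the question.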
Upvotes: 0
Reputation: 151
I suggest you read the source code of Sklearn PolynomialFeatures in this link.
It has two different options:
interaction_only=True
interaction_only=False
The first one uses the combinations function from the itertools package, and the second one uses combinations_with_replacement, to create the new features.
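To make the difference between the two options concrete, here is a small sketch using only itertools. The feature names a, b, c are placeholders chosen for illustration, and the mapping to interaction_only follows the description above rather than being sklearn code itself.

```python
import itertools

features = ["a", "b", "c"]  # placeholder feature names
degree = 2

# interaction_only=True corresponds to combinations: a feature cannot
# repeat within a term, so pure powers such as a*a never appear.
interaction_terms = list(itertools.combinations(features, degree))

# interaction_only=False corresponds to combinations_with_replacement:
# features may repeat, so the squares are included as well.
all_terms = list(itertools.combinations_with_replacement(features, degree))

print(interaction_terms)
# [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(all_terms)
# [('a', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'b'), ('b', 'c'), ('c', 'c')]
```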
Upvotes: 2
Reputation: 33770
You could use the get_feature_names()
method (renamed to get_feature_names_out() in scikit-learn 1.0, with the old name removed in 1.2) to check the names of the columns in the returned matrix:
from sklearn.preprocessing import PolynomialFeatures
import numpy as np
X = np.arange(6).reshape(3, 2)
poly = PolynomialFeatures(10)
poly.fit(X)
poly.get_feature_names(['first', 'second'])
which will output
Out[12]:
['1',
'first',
'second',
'first^2',
'first second',
'second^2',
'first^3',
'first^2 second',
'first second^2',
'second^3',
'first^4',
'first^3 second',
'first^2 second^2',
'first second^3',
'second^4',
'first^5',
'first^4 second',
'first^3 second^2',
'first^2 second^3',
'first second^4',
'second^5',
'first^6',
'first^5 second',
'first^4 second^2',
'first^3 second^3',
'first^2 second^4',
'first second^5',
'second^6',
'first^7',
'first^6 second',
'first^5 second^2',
'first^4 second^3',
'first^3 second^4',
'first^2 second^5',
'first second^6',
'second^7',
'first^8',
'first^7 second',
'first^6 second^2',
'first^5 second^3',
'first^4 second^4',
'first^3 second^5',
'first^2 second^6',
'first second^7',
'second^8',
'first^9',
'first^8 second',
'first^7 second^2',
'first^6 second^3',
'first^5 second^4',
'first^4 second^5',
'first^3 second^6',
'first^2 second^7',
'first second^8',
'second^9',
'first^10',
'first^9 second',
'first^8 second^2',
'first^7 second^3',
'first^6 second^4',
'first^5 second^5',
'first^4 second^6',
'first^3 second^7',
'first^2 second^8',
'first second^9',
'second^10']
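The same names can be rebuilt by hand, which shows how each column maps to a product of inputs. This is a rough sketch in plain Python: monomial_names is a hypothetical helper, not an sklearn function, and it assumes the ordering seen in the output above (by total degree, then in combinations_with_replacement order).

```python
import itertools
from collections import Counter


def monomial_names(input_names, degree):
    """Hypothetical helper: build names like 'first^2 second' in degree order."""
    names = []
    for d in range(degree + 1):
        for combo in itertools.combinations_with_replacement(input_names, d):
            if not combo:
                names.append("1")  # degree 0: the constant column
                continue
            # Count how often each input appears in this product.
            counts = Counter(combo)
            names.append(
                " ".join(
                    name if power == 1 else f"{name}^{power}"
                    for name, power in counts.items()
                )
            )
    return names


names = monomial_names(["first", "second"], 10)
print(len(names))   # 66
print(names[:6])
# ['1', 'first', 'second', 'first^2', 'first second', 'second^2']
```

The count of 66 agrees with the list above and with C(2 + 10, 10) = 66.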
Upvotes: 0