Reputation: 361
I scaled a matrix based on its columns, like this:
import numpy as np
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(-1, 1))
data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]])
scaler = scaler.fit(data)
data_scaled = scaler.transform(data)
Printing data_scaled gave me the following:
array([[-1. , -1. ],
[-0.5, -0.5],
[ 0. , 0. ],
[ 1. , 1. ]])
This is the desired output. However, I'm trying to invert the scaling of only the first column of this matrix, so I tried the following (the error is shown below each line of code):
scaler.inverse_transform(data_scaled[:,1].reshape(1,-1))
Traceback (most recent call last):
File "c:\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-38-6316f51586e7>", line 1, in <module>
scaler.inverse_transform(data_scaled[:,1].reshape(1,-1))
File "c:\anaconda3\lib\site-packages\sklearn\preprocessing\data.py", line 385, in inverse_transform
X -= self.min_
ValueError: operands could not be broadcast together with shapes (1,4) (2,) (1,4)
Also, I tried:
scaler.inverse_transform(data_scaled[:,1].reshape(-1,1))
Traceback (most recent call last):
File "c:\anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-39-397382ddb3fd>", line 1, in <module>
scaler.inverse_transform(data_scaled[:,1].reshape(-1,1))
File "c:\anaconda3\lib\site-packages\sklearn\preprocessing\data.py", line 385, in inverse_transform
X -= self.min_
ValueError: non-broadcastable output operand with shape (4,1) doesn't match the broadcast shape (4,2)
So, how can I invert the scaling for just the first column of that matrix?
Upvotes: 20
Views: 60494
Reputation: 631
sklearn.preprocessing.MinMaxScaler stores its fitted parameters in attributes such as min_ and scale_.
You could copy the entries of those attributes for that particular column into a new, unfitted MinMaxScaler and call inverse_transform on it, which would solve your problem.
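A minimal sketch of that idea, reusing the data from the question. The names col, col_scaler, restored and restored_manual are made up for illustration. It assumes that inverse_transform only needs min_ and scale_ on the new scaler, which matches the `X -= self.min_` step visible in the traceback but relies on scikit-learn internals and may differ between versions. The last line does the same arithmetic by hand, using the fact that the forward transform is X * scale_ + min_.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]])
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(data)
data_scaled = scaler.transform(data)

col = 0  # index of the column to invert (hypothetical name)

# New, unfitted scaler that receives only the fitted parameters of one column.
# Assumption: inverse_transform reads nothing beyond min_ and scale_.
col_scaler = MinMaxScaler(feature_range=(-1, 1))
col_scaler.min_ = scaler.min_[[col]]
col_scaler.scale_ = scaler.scale_[[col]]

# The input must be 2D with a single column, hence the list index [col].
restored = col_scaler.inverse_transform(data_scaled[:, [col]])

# Same result without a second scaler: undo X * scale_ + min_ by hand.
restored_manual = (data_scaled[:, col] - scaler.min_[col]) / scaler.scale_[col]
```

If copying attributes onto an unfitted estimator feels too fragile, the manual arithmetic in the last line avoids touching the scaler's internals beyond reading the two fitted arrays.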
Upvotes: 20
Reputation: 402413
scaler remembers that you passed it a 2D input with two columns, and works under the assumption that all subsequent data passed to it will have the same number of features/columns.
If it's only the first column you want, you will still need to pass inverse_transform an input with the same number of columns. Take the first column from the result and discard the rest.
scaler.inverse_transform(data_scaled)[:, [0]]
array([[-1. ],
[-0.5],
[ 0. ],
[ 1. ]])
This is somewhat wasteful, but is a limitation of the sklearn API.
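If only the single scaled column is at hand (for example, a model's predictions for one feature), the same approach can be sketched by padding it into a placeholder array of the expected width. The names column_only, padded and restored_col are hypothetical, and the zeros in the other column are arbitrary filler that gets discarded afterwards.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]])
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(data)
data_scaled = scaler.transform(data)

col = 0
column_only = data_scaled[:, col]  # pretend this 1D column is all we have

# Build a placeholder with as many columns as the scaler was fitted on,
# then drop the known values into the column of interest.
n_features = scaler.scale_.shape[0]
padded = np.zeros((column_only.shape[0], n_features))
padded[:, col] = column_only

# Invert everything, keep only the column we care about.
restored_col = scaler.inverse_transform(padded)[:, col]
```

Because MinMaxScaler treats each column independently, the filler values do not affect the restored column.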
Upvotes: 21