Reputation: 2698
I have variables that look like this:
data.head()
Ones Population Profit
0 1 6.1101 17.5920
1 1 5.5277 9.1302
2 1 8.5186 13.6620
3 1 7.0032 11.8540
4 1 5.8598 6.8233
X = data.iloc[:, 0:cols]
y = data.iloc[:, cols]
X1 = np.matrix(X.values)
y1 = np.matrix(y.values)
X.shape
>>(97, 2)
y.shape
>>(97,)
X1.shape
>>(97, 2)
y1.shape
>>(1, 97)
data is a pandas DataFrame.
I expected the dimensions of y1 to be 97 x 1, but they are 1 x 97 instead. Somehow y1 was transposed along the way, and I don't understand why. Since my original pandas y was 97 x 1, I thought y1 would be the same, but apparently that's not how it works.
Any explanations?
Upvotes: 1
Views: 1751
Reputation: 184
y.values converts the column into a NumPy array with one dimension, like
[1, 2, 3, 4, 5]
If you call np.matrix on that array, it returns a two-dimensional 1 x 5 matrix:
[[1, 2, 3, 4, 5]]
However, if you reshape the 1-dimensional array into a 2-dimensional column before you call np.matrix, you will get a (5, 1) matrix:
>>> a = np.array([1, 2, 3, 4, 5])
>>> a.shape
(5,)
>>> a
array([1, 2, 3, 4, 5])
>>> np.matrix(a).shape
(1, 5)
>>> a.reshape(-1, 1)
array([[1],
       [2],
       [3],
       [4],
       [5]])
>>> np.matrix(a.reshape(-1, 1)).shape
(5, 1)
>>> np.matrix(a.reshape(-1, 1))
matrix([[1],
        [2],
        [3],
        [4],
        [5]])
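Applied to the data in the question, the same reshape might look like the sketch below. The DataFrame holds only the five rows shown by data.head(), and cols = data.shape[1] - 1 is an assumption (the question never shows how cols is defined).

```python
import numpy as np
import pandas as pd

# Only the five rows printed by data.head() in the question.
data = pd.DataFrame({
    "Ones": [1, 1, 1, 1, 1],
    "Population": [6.1101, 5.5277, 8.5186, 7.0032, 5.8598],
    "Profit": [17.5920, 9.1302, 13.6620, 11.8540, 6.8233],
})

# Assumption: the last column is the target, so cols is its index.
cols = data.shape[1] - 1

y = data.iloc[:, cols]           # a pandas Series: one-dimensional
y_col = y.values.reshape(-1, 1)  # two-dimensional column, shape (n, 1)
y1 = np.matrix(y_col)            # the matrix keeps the column shape

# Alternative: slicing with a range keeps a 2-D DataFrame from the start.
y_frame = data.iloc[:, cols:cols + 1]

print(y.shape, y_col.shape, y1.shape, y_frame.shape)
```

With the full 97-row data set, y_col and y1 would have shape (97, 1).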
Upvotes: 1
Reputation: 52236
Unsolicited advice: using np.matrix isn't really recommended. The biggest thing it bought you was the * operator for matrix multiplication, but with the @ matmul operator introduced in Python 3.5, that's not really necessary.
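As a rough illustration of that point, plain ndarrays multiply as matrices with @, while * stays element-wise. The theta values here are made up for the example and do not come from the question.

```python
import numpy as np

# Three rows shaped like the question's X (a column of ones plus Population).
X = np.array([[1.0, 6.1101],
              [1.0, 5.5277],
              [1.0, 8.5186]])

# Hypothetical parameter column, chosen only for illustration.
theta = np.array([[0.5],
                  [2.0]])

pred = X @ theta      # matrix product: (3, 2) @ (2, 1) -> (3, 1)
elementwise = X * X   # on ndarrays, * multiplies element by element

print(pred.shape, elementwise.shape)
```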
That said, the key thing to note here is that the shape of y is not 97 x 1; it is (97,), that is, a one-dimensional array. A NumPy matrix is always two-dimensional, and simply by convention a 1-D array is converted into a 1 x N matrix.
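A small sketch of that convention, using np.arange(97) as a stand-in for y.values (the real values are not needed to see the shapes):

```python
import numpy as np

# Stand-in for y.values: a one-dimensional array of length 97.
y = np.arange(97, dtype=float)

row = np.matrix(y)           # 1-D input becomes a single row
col = np.matrix(y).T         # transpose to get the expected column
as_2d = np.atleast_2d(y)     # plain ndarrays follow the same row convention
col_array = y[:, np.newaxis] # ndarray column without using np.matrix

print(y.ndim, y.shape)       # one dimension
print(row.shape, col.shape, as_2d.shape, col_array.shape)
```

If a column vector is all that is needed, y[:, np.newaxis] or reshape(-1, 1) on a regular array avoids np.matrix entirely.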
Upvotes: 1