Reputation: 1071
I am having a lot of "small" issues when using numpy for linear algebra manipulations, due to the way numpy treats "vectors", i.e. 1d arrays, which gives inconsistent behaviour in my eyes.
My question is whether I am making some glaring mistake in how I use numpy arrays for linear algebra, or whether this is just how it works and there is no other obvious way to do it.
For example, let's say I want to perform univariate OLS on two vectors.
import numpy as np
from numpy import linalg as la
y = np.arange(10)
x = np.arange(10)
print(x.shape)
ols = la.inv(x.T@x)@(x.T@y)
LinAlgError: 0-dimensional array given. Array must be at least two-dimensional
So one solution is to force the arrays to have that extra dimension:
import numpy as np
from numpy import linalg as la
y = np.arange(10).reshape(-1, 1)
x = np.arange(10).reshape(-1, 1)
ols = la.inv(x.T@x)@(x.T@y)
print(ols)
>>> [[1.]]
Then one could think the problem is solved! But not exactly. If X has more than one column, calculating t-values becomes a problem.
y = np.arange(10).reshape(-1, 1)
X = np.arange(20).reshape(10, 2)
b_hat = la.inv((X.T@X))@(X.T@y)
# Calculate standard errors
residual = y - X@b_hat
sigma_hat = residual.T@residual/(y.size - b_hat.size)
b_var = sigma_hat*la.inv(X.T@X)
b_std = np.sqrt(b_var.diagonal()) # The diagonal method returns 1d array.
# Calculate t-values
t_values = b_hat/b_std
print(t_values)
>>> [[2.47854011e+13 2.67712930e+13]
[1.40888694e+00 1.52177182e+00]]
Which of course was not intended. Why does this happen? It's because np.sqrt(b_var.diagonal()) returns a (2,) shape for b_std. So when I divide b_hat/b_std, numpy sees that the shapes differ (b_hat has shape (2, 1)) and, instead of a plain element-wise division, broadcasts the two arrays against each other, producing a (2, 2) result.
The solution for this is of course to again use .reshape(-1, 1), but I am going to have increasingly complex calculations, so it's cumbersome to always check whether a vector was returned as a 1d or 2d array and then reshape it. It is also prone to errors, if I accidentally reshape a matrix into a vector.
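One could of course hide the reshape behind a small helper (the name as_col is made up, purely for illustration), continuing the snippet above:

def as_col(a):
    # Force any vector, 1d or 2d, into an (n, 1) column.
    return np.asarray(a).reshape(-1, 1)

t_values = b_hat / as_col(b_std)  # b_std goes (2,) -> (2, 1); division is now element-wise
print(t_values)

But that still means remembering to call it in every expression.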
So again: am I making some glaring mistake in how I use numpy arrays for linear algebra, or is this just how it works, with no other obvious way to do it?
Upvotes: 1
Views: 755
Reputation: 231375
Even in MATLAB, which is matrix oriented (and originally had only 2d matrices), I found that keeping track of dimensions was 80% of the debugging battle. In numpy there's no shortcut to keeping a close eye on dimensions. Don't assume.
In your first case, the arrays are 1d:
In [305]: x
Out[305]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [306]: x.T # same, only one axis to 'switch'
Out[306]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
The matrix product of two 1d arrays is the dot/inner product, and the result is a scalar:
In [307]: x.T@x
Out[307]: 285
@/matmul does not like to work with scalars. np.dot is ok with them, but matmul was created for working with 'batches' of arrays:
In [308]: (x@x)@(x@y)
Traceback (most recent call last):
File "<ipython-input-308-2d486b202d47>", line 1, in <module>
(x@x)@(x@y)
TypeError: unsupported operand type(s) for @: 'numpy.int64' and 'numpy.int64'
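A quick check of that difference, with the same 1d arrays (np.dot does accept scalar operands):

import numpy as np

x = np.arange(10)
y = np.arange(10)
print(np.dot(x, x))      # 285 - same sum-of-products as x@x
print(np.dot(285, 285))  # 81225 - np.dot happily multiplies plain scalars
# (x @ x) @ (x @ y)      # TypeError - matmul refuses scalar operands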
Then you create 2d arrays, with (10,1) shape:
In [309]: y = y[:,None]
In [310]: y
Out[310]:
array([[0],
[1],
...
[9]])
In [311]: y.T # (1,10) shape
Out[311]: array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
Now the matmul of (1,10) with (10,1) sums on the 10s to produce a (1,1) array:
In [312]: y.T@y
Out[312]: array([[285]])
In [313]: _.shape
Out[313]: (1, 1)
Two (1,1) arrays work with @ to again produce a (1,1), and inv is ok with that.
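So with the (10,1) versions of x and y from the question, everything stays 2d all the way through:

import numpy as np

x = np.arange(10).reshape(-1, 1)  # (10, 1)
y = np.arange(10).reshape(-1, 1)  # (10, 1)
print(np.linalg.inv(x.T @ x))              # [[0.00350877]] - just 1/285
print(np.linalg.inv(x.T @ x) @ (x.T @ y))  # [[1.]] - the question's OLS result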
With the (10,2) X:
In [314]: X.shape
Out[314]: (10, 2)
In [315]: X.T@X # (2,10) with (10,2) producing (2,2)
Out[315]:
array([[1140, 1230],
[1230, 1330]])
In [316]: X.T@y # (2,10) with (10,1) producing (2,1)
Out[316]:
array([[570],
[615]])
In [317]: (X.T@X)@(X.T@y) # (2,2) with (2,1) producing (2,1)
Out[317]:
array([[1406250],
[1519050]])
The key to all of these is that (K,n) with (n,M) produces (K,M), summing products over the shared n: the last axis of the first argument pairs with the second-to-last axis of the second. That's the high-school across-rows-and-down-columns method of doing a matrix product.
The matmul docs talk about promoting 1d arrays to 2d to perform the same operation, but the extra dimensions are removed from the result.
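The promotion works on either side; for example, with the original 1d x and the (10,2) X:

import numpy as np

x = np.arange(10)                 # (10,)
X = np.arange(20).reshape(10, 2)  # (10, 2)
print((x @ X).shape)    # (2,) - x promoted to (1,10), result (1,2), extra axis removed
print((X.T @ x).shape)  # (2,) - x promoted to (10,1), result (2,1), extra axis removed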
The next @ joins (10,2) with (2,1) to produce a (10,1) (summing on the 2s):
In [319]: X@Out[317]
Out[319]:
array([[ 1519050],
[ 7369650],
[13220250],
...
[54174450]])
That can be subtracted from the (10,1) y.
residual.T@residual is (1,10) with (10,1), producing a (1,1) (a magnitude).
sigma_hat*la.inv(X.T@X) is a (1,1) times a (2,2), resulting in a (2,2). (That would have worked just as well if sigma_hat were a scalar; inv(A) on a (1,1) is just 1/A.)
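That * is element-wise broadcasting, not matmul; a (1,1) stretches over any shape (the 2.0 here is a made-up value, just to show the shapes):

import numpy as np

X = np.arange(20).reshape(10, 2)
sigma_hat = np.array([[2.0]])                      # (1, 1)
print((sigma_hat * np.linalg.inv(X.T @ X)).shape)  # (2, 2) - the (1,1) broadcasts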
b_var.diagonal() throws away half the values of b_var. Since b_var is a scalar times X.T@X, we can examine:
In [323]: X.T@X
Out[323]:
array([[1140, 1230],
[1230, 1330]])
In [324]: (X.T@X).diagonal()
Out[324]: array([1140, 1330])
Which is the same as summing on the size-10 dimension:
In [325]: (X*X).sum(0)
Out[325]: array([1140, 1330])
In [326]: np.einsum('ij,ij->j',X,X)
Out[326]: array([1140, 1330])
It's treating the size-2 dimension as a 'batch':
In [328]: [X[:,i]@X[:,i] for i in range(2)]
Out[328]: [1140, 1330]
matmul is designed to work with 'batches', that is, 3d (and higher) arrays:
In [329]: Xt = X.T
In [330]: Xt.shape
Out[330]: (2, 10)
In [331]: Xt[:,None,:]@Xt[:,:,None] # (2,1,10) with (2,10,1)=>(2,1,1)
Out[331]:
array([[[1140]],
[[1330]]])
In [332]: (Xt[:,None,:]@Xt[:,:,None]).squeeze()
Out[332]: array([1140, 1330])
This suggests that X should have been a (2,10), or even a (2,1,10), right from the start (and y a (1,10)?).
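Alternatively, staying with the question's column-vector layout, the one needed fix is reshaping the diagonal back into a column; a minimal sketch of the full calculation:

import numpy as np
from numpy import linalg as la

y = np.arange(10).reshape(-1, 1)   # (10, 1)
X = np.arange(20).reshape(10, 2)   # (10, 2)

b_hat = la.inv(X.T @ X) @ (X.T @ y)                        # (2, 1)
residual = y - X @ b_hat                                   # (10, 1)
sigma_hat = residual.T @ residual / (y.size - b_hat.size)  # (1, 1)
b_var = sigma_hat * la.inv(X.T @ X)                        # (2, 2)
b_std = np.sqrt(b_var.diagonal()).reshape(-1, 1)           # (2,) reshaped to (2, 1)
t_values = b_hat / b_std                                   # (2, 1), element-wise as intended
# With this toy data the residuals are ~0, so the values blow up;
# the point is the (2, 1) shape, not the numbers.
print(t_values.shape)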
This answer is a bit long-winded, but hopefully the details give you ideas for how to keep track of dimensions. It's meant to complement the general principles of the other answer.
Upvotes: 1
Reputation: 3722
Perhaps remembering these rules might help provide the caution you are looking for when using numpy (the terms "axes" and "dimensions" mean the same here):
- In Mathematics, when we write [1 2 3 4], the semantics we choose to associate with that notation vary a bit loosely with context. There are times when we consider it a single-axis array (which is the correct semantics), but there are times when we treat it as "1 row, 4 columns". How else would you justify mathematicians' claim that a column vector, when transposed, gives a row vector, and vice versa? The term "transpose" means an interchange of rows and columns, which itself implies that there are two axes, not just one. In numpy, the semantics for the same thing are consistently and strictly "a single axis of length 4", never "a first axis of length 1 and a second axis of length 4".
- In numpy, as in Mathematics, the idea of transpose makes sense only if you have at least two axes. As noted above, Mathematics does not have a consistent notation that distinguishes a single-axis array from a two-axes array, so there this rule is really moot. In numpy, arr.T simply returns arr if arr happens to be a single-axis array.
- In numpy, we can add an extra axis of unit length at any position we choose. One notation for this is arr.reshape(n1,n2,...,1,...,nk) (that is, inserting a 1 in the midst of the existing comma-separated axis lengths). Another is the indexing notation arr[:,:,...,None,...,:] (that is, writing as many comma-separated colons as there are axes and inserting a None amongst them). In place of None, np.newaxis can also be used, but it's a bit more verbose.
- One might expect the numpy matrix multiplication operator @ to throw an error in arr @ arr.T if arr has only a single axis (e.g., shape (3,)). (How could matrix multiplication even be defined for single-axis arrays?) Instead, the expression returns the sum-product of the elements of arr and arr.T, as a scalar (not even as a single-element array).
- In numpy, arithmetic and comparison operators, when used between two arrays of the same shape, are applied "element-wise" (that is, between each pair of corresponding elements of the two arrays). The result is a new array whose shape is the same as that of the operand arrays.
- When the operand shapes differ, numpy tries to "broadcast" the arrays to a common shape by stretching unit-length axes, and then applies the operation element-wise.
- A plain scalar operand is treated as an array of shape (1,), and the previous (broadcasting) rule will then be applied.
- Although these broadcasting rules are powerful and convenient in numpy, they frequently surprise new learners. For example, 1.0 / arr, where arr is [1 2 3 4], will produce a new array consisting of the values [1.0/1 1.0/2 1.0/3 1.0/4]. (I think this was one of the surprises you faced when you performed a division.)
- If arr has a shape of (3,4,1,5,2,1,1), then arr.squeeze() will get rid of the unit-length axes, returning an array of shape (3,4,5,2).
- In numpy, an indexing expression such as arr[my_index_arr] can produce a shape that is more complex and of higher dimensionality than the original array arr. Again, this is a powerful expressive feature that can sometimes surprise/confuse new learners. In numpy, this is called Advanced Indexing with Integer Arrays.

To stress one point from the above: be extra careful about your expectations when your array has a single axis (a shape like (L,)). Several of these rules are illustrated in the short demo below.
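A short demo (plain numpy only):

import numpy as np

arr = np.array([1, 2, 3, 4])   # single axis, shape (4,)
print(arr.T.shape)             # (4,) - transposing a 1d array is a no-op
print(arr @ arr.T)             # 30   - sum-product, returned as a scalar
col = arr[:, None]             # (4, 1) - insert a unit-length axis
row = arr[None, :]             # (1, 4)
print((col @ row).shape)       # (4, 4) - now a real matrix product
print(1.0 / arr)               # [1. 0.5 0.33333333 0.25] - broadcasting over elements
print(np.zeros((3, 4, 1, 5, 2, 1, 1)).squeeze().shape)  # (3, 4, 5, 2)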
Upvotes: 3