ShakesBeer

Reputation: 293

Numpy Linalg norm behaving oddly (wrongly)

I have a large vector F with a few million entries that gives this inconsistent behaviour when taking norms.

np.linalg.norm(F,2.000001)=3225.96..
np.linalg.norm(F,2)=inf
np.linalg.norm(F,1.999999)=3226.01..
np.linalg.norm(F,1)=inf
---------
np.linalg.norm(F)=inf
np.linalg.norm(F/12)=inf
np.linalg.norm(F/13)=246.25
---------
np.sum(F*F)=inf
np.sum(F*F/169)=60639
np.sum(F*F/144)=inf
---------
np.all(np.isfinite(F))=True
np.max(np.abs(F))=11
---------
F.dtype=dtype('float16')

Aside from some sort of hacky solution, does anyone have any idea what's going on?

Upvotes: 3

Views: 2247

Answers (2)

Manu J4

Reputation: 2859

There seems to be no fix from NumPy yet, so for completeness here is another (fairly obvious) workaround for computing a norm:

import numpy as np

def calcNorm(vector):
    # float16 overflows while summing the squares, so upcast first
    if vector.dtype == np.float16:
        vector = vector.astype(np.float32)
    return np.linalg.norm(vector)
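For example (with values chosen, hypothetically, so the float16 sum of squares overflows while the float32 one does not):

```python
import numpy as np

# Three entries of 200: the sum of squares is 3 * 200**2 = 120000,
# which exceeds the float16 maximum of 65504
v = np.full(3, 200.0, dtype=np.float16)

# Upcasting to float32 before taking the norm avoids the overflow
norm = np.linalg.norm(v.astype(np.float32))
print(norm)  # ≈ 346.41, i.e. 200 * sqrt(3)
```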

Or, as I needed it, for the use case of normalizing a vector:

def normalize(vector):
    prevType = vector.dtype
    if vector.dtype == np.float16:
        vector = vector.astype(np.float32)
    norm = np.linalg.norm(vector)
    if norm != 0 and np.isfinite(norm):
        vector = vector / norm  # not /=, to avoid mutating the caller's array
    return vector.astype(prevType)
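A quick sanity check of this approach (the vector values here are made up for illustration):

```python
import numpy as np

def normalize(vector):
    prevType = vector.dtype
    if vector.dtype == np.float16:
        vector = vector.astype(np.float32)  # avoid float16 overflow
    norm = np.linalg.norm(vector)
    if norm != 0 and np.isfinite(norm):
        vector = vector / norm
    return vector.astype(prevType)

v = np.float16([300.0, 400.0])  # sum of squares is 250000 > 65504
u = normalize(v)
print(u, u.dtype)  # ≈ [0.6 0.8] float16
```

The result comes back in the original dtype, so the caller never sees the float32 intermediate.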

Upvotes: 0

Eric

Reputation: 97575

As described in the comments, your issue is that float16 is too small to represent the intermediate results: its maximum value is 65504. A much simpler test case is:

np.linalg.norm(np.float16([1000]))
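The overflow already happens at the squaring step, before the sum:

```python
import numpy as np

x = np.float16(1000.0)
print(x * x)  # 1000**2 = 1e6 exceeds the float16 max of 65504, so this is inf
```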

To avoid overflow, you can divide by your largest absolute value, and then multiply back:

def safe_norm(x):
    # rescale so the largest magnitude is 1, take the norm, then undo the scaling
    xmax = np.max(np.abs(x))
    return np.linalg.norm(x / xmax) * xmax
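A quick check on a small float16 vector (a negative entry included, hence the maximum of absolute values):

```python
import numpy as np

def safe_norm(x):
    # Rescale by the largest magnitude, take the norm, then scale back
    xmax = np.max(np.abs(x))
    return np.linalg.norm(x / xmax) * xmax

v = np.float16([300.0, -300.0])  # sum of squares 180000 > 65504
print(safe_norm(v))  # ≈ 424.26 (300 * sqrt(2)), up to float16 rounding
```

Note that the result stays in float16 throughout, so no extra memory is needed for a float32 copy of a vector with millions of entries.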

There's perhaps an argument that np.linalg.norm should do this by default for float16.

Upvotes: 3
