user8026974

Reputation:

How to convert int32 numpy array into int16 numpy array?

I want to convert a numpy array from int32 type to int16 type.

I have an int32 array called array_int32 and I am converting that to int16.

import numpy as np
array_int32 = np.array([31784960, 69074944, 165871616])
array_int16 = array_int32.astype(np.int16)

After conversion, array_int16 turns into an array of zeros. I don't know what mistake I am making. Could anyone help me with this?

Upvotes: 7

Views: 30357

Answers (3)

Boris Reif

Reputation: 143

As the user jdamp has pointed out, the numbers are too large to be represented as 16-bit integer values. I don't know the context of your question, but it may be useful to know that a simple rescaling of the numbers is possible.

import math
import numpy as np

def scale_to(x, x_min, x_max, t_min, t_max):
    """
    Scales x to lie between t_min and t_max.
    Links:
        https://stats.stackexchange.com/questions/281162/scale-a-number-between-a-range
        https://stats.stackexchange.com/questions/178626/how-to-normalize-data-between-1-and-1
    """
    r = x_max - x_min        # range of the input values
    r_t = t_max - t_min      # range of the target interval
    # guard against division by zero when all input values are equal
    assert not math.isclose(r, 0, abs_tol=np.finfo(float).eps)
    x_s = r_t * (x - x_min) / r + t_min
    return x_s

A conversion of these rather large values into a 16-bit format would then look like this:

array_float = np.array([31784960.12, 69074944.12, 165871616.34])
scaled_array = scale_to(array_float, np.min(array_float), np.max(array_float), -32768, 32767)
array_int16 = scaled_array.astype(np.int16)

The values -32768 and 32767 are the smallest and largest values that a signed 16-bit integer can represent. The minimum and maximum of your input array are mapped to these limits, and all other values are scaled linearly in between. Only then, as a final step, is the type cast performed. So, the resulting output for the values above looks like this:

array_int16

array([-32768, -14542, 32767], dtype=int16)

Please note that I changed the input to floating-point values just to show that this also works with floats.

The numbers can be scaled back to nearly their original values if we remember the min and max values of the original array.

def scale_inv(x_s, x_min, x_max, t_min, t_max):
    """
    Inverse scaling: maps values from [t_min, t_max] back to [x_min, x_max].
    Links:
        https://stats.stackexchange.com/questions/281162/scale-a-number-between-a-range
        https://stats.stackexchange.com/questions/178626/how-to-normalize-data-between-1-and-1
    """
    r = x_max - x_min        # range of the original values
    r_t = t_max - t_min      # range of the target interval
    # guard against division by zero when the target interval is empty
    assert not math.isclose(r_t, 0, abs_tol=np.finfo(float).eps)
    x = (x_s - t_min) * r / r_t + x_min
    return x

inv = scale_inv(array_int16.astype(float), np.min(array_float), np.max(array_float), -32768.0, 32767.0)

The last line gives us back the original values with some round-off errors:

array([3.17849601e+07, 6.90759252e+07, 1.65871616e+08])

The original values were: 31784960.12, 69074944.12, 165871616.34 (as seen above in the code)

Depending on your context, this may be useful, for example in audio file conversions; see the sketch below. (If a simple rescaling with type casting is not what you are looking for, then maybe you need to look at resampling.)
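
As a rough sketch of the audio case (my addition, not from the original answer, and assuming the float samples are already normalised to [-1.0, 1.0]), the usual scale-and-cast to 16-bit PCM looks like this:

import numpy as np

# Sketch only: samples are assumed to lie in [-1.0, 1.0]; they are scaled to
# the int16 range and then cast. The clip guards against values slightly
# outside that range.
samples_float = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
samples_int16 = np.clip(samples_float * 32767, -32768, 32767).astype(np.int16)
print(samples_int16)   # [-32767 -16383      0  16383  32767]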

Keep in mind though, some loss of information is always involved and unavoidable for the reason given by jdamp. As an analogy: you usually cannot squeeze the contents of a full large box into a smaller box.

P.S.: For scaling, see in particular this link on Stack Exchange: min-max-scaler. Further links are given in the docstrings of the code above.

Upvotes: 0

Mark Setchell

Reputation: 207798

You could discard the bottom 16 bits:

n = (array_int32 >> 16).astype(np.int16)

which will give you this:

array([ 485, 1054, 2531], dtype=int16)
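
As a quick sanity check (added for illustration, not part of the answer): shifting back left by 16 bits recovers the originals exactly in this particular example, because the inputs happen to be multiples of 2^16; in general the discarded low 16 bits are lost.

import numpy as np

# Shifting back recovers the originals here only because each input
# is an exact multiple of 2**16.
array_int32 = np.array([31784960, 69074944, 165871616])
n = (array_int32 >> 16).astype(np.int16)
print(n.astype(np.int64) << 16)   # [ 31784960  69074944 165871616]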

Upvotes: 10

jdamp

Reputation: 1460

The numbers in your array_int32 are too large to be represented with 16 bits (a signed 16-bit integer can only represent a maximum value of 2^15-1 = 32767). When you cast anyway, numpy just keeps the lowest 16 bits of each value, and because your numbers happen to be exact multiples of 2^16, the result is an array of zeros.
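
A short way to see this (added for illustration):

import numpy as np

# Each value is an exact multiple of 2**16, so the retained low 16 bits are zero.
array_int32 = np.array([31784960, 69074944, 165871616])
print(array_int32 % 2**16)              # [0 0 0]
print(array_int32.astype(np.int16))     # [0 0 0]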

This behavior can be modified by changing the optional casting argument of astype. The documentation states:

Starting in NumPy 1.9, astype method now returns an error if the string dtype to cast to is not long enough in ‘safe’ casting mode to hold the max value of integer/float array that is being casted. Previously the casting was allowed even if the result was truncated.

So, adding the requirement casting='safe' will result in a TypeError for a conversion from 32 (or 64) bits down to 16 bits, because the maximum value of the original type is too large for the new type, e.g.

import numpy as np
array_int32 = np.array([31784960, 69074944, 165871616])
array_int16 = array_int32.astype(np.int16, casting='safe')

results in

TypeError: Cannot cast array from dtype('int64') to dtype('int16') according to the rule 'safe'
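
If you prefer to check ahead of time instead of catching the exception, np.can_cast answers the same question without raising (a small illustration, not part of the original answer):

import numpy as np

# np.can_cast reports whether a cast is allowed under a given casting rule.
print(np.can_cast(np.int64, np.int16, casting='safe'))   # False
print(np.can_cast(np.int16, np.int64, casting='safe'))   # True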

Upvotes: 2
