Reputation:
I want to convert a NumPy array from int32 type to int16 type.
I have an int32 array called array_int32 and I am converting that to int16.
import numpy as np
array_int32 = np.array([31784960, 69074944, 165871616])
array_int16 = array_int32.astype(np.int16)
After conversion, array_int16 turns into an array of zeros. I don't know what mistake I am making. Could anyone help me with this?
Upvotes: 7
Views: 30357
Reputation: 143
As has been pointed out by the user jdamp, the numbers are too large to be represented as 16-bit integer values. I don't know the context of your question, but it may be useful to know that a simple rescaling of the numbers can be done.
import math
import numpy as np

def scale_to(x, x_min, x_max, t_min, t_max):
    """
    Scales x to lie between t_min and t_max
    Links:
    https://stats.stackexchange.com/questions/281162/scale-a-number-between-a-range
    https://stats.stackexchange.com/questions/178626/how-to-normalize-data-between-1-and-1
    """
    r = x_max - x_min
    r_t = t_max - t_min
    # The input range must be non-zero, otherwise the division below fails.
    assert not math.isclose(0, r, abs_tol=np.finfo(float).eps)
    x_s = r_t * (x - x_min) / r + t_min
    return x_s
A conversion of these rather large values into a 16 bit format would then look like this:
array_float = np.array([31784960.12, 69074944.12, 165871616.34])
scaled_array = scale_to(array_float,np.min(array_float),np.max(array_float), -32768,32767)
array_int16 = scaled_array.astype(np.int16)
The values -32768 and 32767 are the smallest and largest values that can be represented by a signed 16-bit integer. The minimum of your input array is mapped to -32768 and the maximum to 32767; all other values are scaled in between. Only then, as a final step, is the type cast done. The resulting output for the values above looks like this:
array_int16
array([-32768, -14542, 32767], dtype=int16)
Please note that I changed the input to floating point values just to show that this also works with floats.
The numbers can be scaled back to nearly their original value if we remember the min and max values of the original array.
def scale_inv(x_s, x_min, x_max, t_min, t_max):
    """
    Inverse scaling
    Links:
    https://stats.stackexchange.com/questions/281162/scale-a-number-between-a-range
    https://stats.stackexchange.com/questions/178626/how-to-normalize-data-between-1-and-1
    """
    r = x_max - x_min
    r_t = t_max - t_min
    # The target range must be non-zero, otherwise the division below fails.
    assert not math.isclose(0, r_t, abs_tol=np.finfo(float).eps)
    x = (x_s - t_min) * r / r_t + x_min
    return x
inv = scale_inv(array_int16.astype(float), np.min(array_float), np.max(array_float), -32768.0, 32767.0)
The last line gives us back the original values with some round-off errors:
array([3.17849601e+07, 6.90759252e+07, 1.65871616e+08])
The original values were: 31784960.12, 69074944.12, 165871616.34 (as seen above in the code)
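To get a feeling for how large these round-off errors can become, here is a minimal sketch that computes the width of one int16 step in the original units and the round-trip error per element. It inlines the same scaling formulas as above; the variable names step, as_int16, restored and error are only illustrative.

```python
import numpy as np

array_float = np.array([31784960.12, 69074944.12, 165871616.34])
x_min, x_max = array_float.min(), array_float.max()
t_min, t_max = -32768, 32767

# Width of one int16 step, expressed in the original units.
step = (x_max - x_min) / (t_max - t_min)

# Forward scaling and cast, then the inverse scaling.
scaled = (t_max - t_min) * (array_float - x_min) / (x_max - x_min) + t_min
as_int16 = scaled.astype(np.int16)
restored = (as_int16.astype(float) - t_min) * step + x_min

# Absolute round-trip error per element; it stays within about one step.
error = np.abs(restored - array_float)
print(step)
print(error)
```

For this input one step is roughly 2046 original units, which is consistent with the middle value coming back about 981 off.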
This may be useful, for example, in audio file conversions, depending on your context. (If a simple rescaling with type casting is not what you are looking for, then maybe you need to look at resampling.)
Keep in mind, though, that some loss of information is always involved and unavoidable for the reason given by jdamp. As an analogy: you usually cannot squeeze the contents of a full large box into a smaller box.
P.S.: For scaling, see in particular this link on Stack Exchange: min-max-scaler. Further links are given in the docstrings of the code.
Upvotes: 0
Reputation: 207798
You could discard the bottom 16 bits:
n=(array_int32>>16).astype(np.int16)
which will give you this:
array([ 485, 1054, 2531], dtype=int16)
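As a small sketch of what the shift keeps and what it throws away: shifting right by 16 is the same as a floor division by 65536, and shifting back only restores the original where the discarded low 16 bits were zero, which happens to be the case for these three values. The names high and restored are just for illustration.

```python
import numpy as np

array_int32 = np.array([31784960, 69074944, 165871616], dtype=np.int32)

# Keep the upper 16 bits; equivalent to floor division by 65536.
high = (array_int32 >> 16).astype(np.int16)

# Shifting back restores the original only if the low 16 bits were zero.
restored = high.astype(np.int32) << 16

print(high)
print(np.array_equal(restored, array_int32))
```

For inputs with non-zero low bits, restored would differ from the original by up to 65535.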
Upvotes: 10
Reputation: 1460
The numbers in your array_int32 are too large to be represented with 16 bits (a signed 16-bit integer can only represent a maximum value of 2^15-1=32767).
With the default casting, NumPy keeps only the lowest 16 bits of each value. Every number in your array is a multiple of 65536, so those bits are all zero and the result is an array of zeros.
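A quick check suggests that the zeros come from integer wraparound rather than from any special handling: the default cast keeps the low 16 bits, and for these particular values those bits are all zero. The sketch below shows this, together with two made-up values (70000 and 40000) that wrap to non-zero results.

```python
import numpy as np

array_int32 = np.array([31784960, 69074944, 165871616], dtype=np.int32)

# The low 16 bits of every value are zero (each is a multiple of 65536).
low_bits = array_int32 & 0xFFFF

# The default cast (casting='unsafe') keeps only those low 16 bits.
array_int16 = array_int32.astype(np.int16)

# Values with non-zero low bits wrap around instead of becoming zero.
wrapped = np.array([70000, 40000], dtype=np.int32).astype(np.int16)

print(low_bits)     # [0 0 0]
print(array_int16)  # [0 0 0]
print(wrapped)      # [  4464 -25536]
```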
This behavior can be modified by changing the optional casting argument of astype. The documentation states:
Starting in NumPy 1.9, astype method now returns an error if the string dtype to cast to is not long enough in ‘safe’ casting mode to hold the max value of integer/float array that is being casted. Previously the casting was allowed even if the result was truncated.
So, additionally passing casting='safe' will result in a TypeError, because a conversion from 32 (or 64) bits down to 16 is not considered safe: the maximum value of the original type is too large for the new type, e.g.
import numpy as np
array_int32 = np.array([31784960, 69074944, 165871616])
array_int16 = array_int32.astype(np.int16, casting='safe')
results in
TypeError: Cannot cast array from dtype('int64') to dtype('int16') according to the rule 'safe'
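Note that the 'safe' rule only compares the dtypes, so it also rejects arrays whose actual values would fit into 16 bits. If you want a check based on the values instead, one possible sketch is the following; to_int16_checked is a hypothetical helper name, not part of NumPy.

```python
import numpy as np

def to_int16_checked(values):
    """Cast to int16, raising ValueError if any value would not fit."""
    values = np.asarray(values)
    info = np.iinfo(np.int16)  # info.min == -32768, info.max == 32767
    if values.size and (values.min() < info.min or values.max() > info.max):
        raise ValueError("values do not fit into int16")
    return values.astype(np.int16)

small = to_int16_checked([1, -2, 300])  # fits, returns an int16 array

try:
    to_int16_checked([31784960, 69074944, 165871616])
    fits = True
except ValueError:
    fits = False  # the values from the question do not fit
```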
Upvotes: 2