Reputation: 319
I would like to know how NumPy casts from float32 to float16, because when I cast a number like 8193 from float32 to float16 using astype, the output is 8192, whereas a float32 10000 casts to a float16 10000.
import numpy as np
a = np.array([8193], dtype=np.float32)
b = a.astype(np.float16)  # array([8192.], dtype=float16)
Upvotes: 16
Views: 57212
Reputation: 129
TensorFlow requires float16 and raises an error for float32. You can use what Reti43 suggested:
np.float16(a)
Out[102]: array([8192.], dtype=float16)
I'm surprised that a useless reply has been upvoted so highly. I know that moderators ask authors to accept the highest-voted answer, but a question author is not obliged to do so. A number of people here just collect points and do not care about actually answering the question; they might even upvote themselves under different accounts.
Upvotes: -2
Reputation: 55469
The IEEE 754-2008 16-bit base-2 format, aka binary16, doesn't give you a lot of precision. What do you expect from 16 bits? :) 1 bit is the sign bit, 5 bits encode the exponent, and that leaves 10 bits to store the normalised 11-bit significand, so any integer > 2**11 == 2048 has to be quantized.
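You can see the 1/5/10 bit layout directly by reinterpreting a float16's bits as an unsigned integer. A small sketch: 8192 == 2**13, so its exponent field should hold 13 plus the bias of 15, i.e. 28, with an all-zero mantissa.

```python
import numpy as np

# View the raw 16 bits of a float16 as uint16 and split them
# into the sign (1 bit), exponent (5 bits), mantissa (10 bits).
bits = np.array(8192, dtype=np.float16).view(np.uint16)
sign = int(bits >> 15)
exponent = int((bits >> 10) & 0x1F)
mantissa = int(bits & 0x3FF)
print(sign, exponent, mantissa)  # 0 28 0
```

Because the mantissa has only 10 stored bits, any integer needing more than 11 significant bits cannot be represented exactly.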
According to Wikipedia, integers between 4097 and 8192 round to a multiple of 4, and integers between 8193 and 16384 round to a multiple of 8.
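You can check the rounding behaviour from the question directly; in [8192, 16384) the float16 spacing is 8, so 8193 rounds down to 8192, while 10000 (a multiple of 8) survives the cast exactly:

```python
import numpy as np

# Integers in [8192, 16384) round to the nearest multiple of 8
# when cast to float16 (ties go to the even mantissa).
vals = np.array([8193, 8200, 10000], dtype=np.float32)
halves = vals.astype(np.float16)
print(halves)  # [ 8192.  8200. 10000.]
```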
Upvotes: 19