Reputation: 61
I am dividing a tensor of type int32 by a tensor of type int32, and the result is float64. I can't find an explanation of why this happens, or of the implicit rules behind how TensorFlow handles it. I have not explicitly defined a dtype for any tensor; I have checked all of them, and none has a 64-bit type until after the division.
I've tried different formulations of division, such as tf.divide; all give the same result.
My code looks like:
a_cdf = a / tf.size(a)
with a being of type tf.int32.
What I want to get is the result as float32, so I can write my function without an explicit cast.
Upvotes: 5
Views: 2572
Reputation: 59681
This is by design. "True" division in TensorFlow (that is, real division) uses a _TRUEDIV_TABLE
that specifies the casting rules for each type, and it currently reads:
# Conversion table for __truediv__. None entries mean no conversion required.
_TRUEDIV_TABLE = {
    dtypes.uint8: dtypes.float32,
    dtypes.int8: dtypes.float32,
    dtypes.uint16: dtypes.float32,
    dtypes.int16: dtypes.float32,
    dtypes.int32: dtypes.float64,
    dtypes.int64: dtypes.float64,
    dtypes.bfloat16: None,
    dtypes.float16: None,
    dtypes.float32: None,
    dtypes.float64: None,
    dtypes.complex64: None,
    dtypes.complex128: None,
}
Meaning that int32 tensors will be converted to float64. If you want to obtain a float32 as output, either use a smaller int type or cast your inputs to float32.
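A minimal sketch of the cast-based workaround, reusing the variable names from the question:

```python
import tensorflow as tf

a = tf.constant([1, 2, 3])  # dtype defaults to int32

# True division promotes int32 operands to float64
a_cdf_64 = a / tf.size(a)

# Casting both operands to float32 first keeps the result in float32
a_cdf_32 = tf.cast(a, tf.float32) / tf.cast(tf.size(a), tf.float32)
```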
The rationale for this is another matter. If I had to guess, I'd say on the one hand that if you are using 8- or 16-bit integers you are probably concerned about memory, so a smaller result type makes sense. On the other hand, you could make the following argument:
import numpy as np
# Compute smallest positive divisions with 16 and 32 bits
smallest_16bit_fraction = 1 / ((1 << 16) - 1)
smallest_32bit_fraction = 1 / (-(1 << 31)) # 31 bits because int32 is signed
# Compute one plus the smallest fractions with 32 and 64 bit floats
print(np.float32(1) + np.float32(smallest_16bit_fraction))
# 1.0000153
print(np.float64(1) + np.float64(smallest_16bit_fraction))
# 1.0000152590218967
print(np.float32(1) + np.float32(smallest_32bit_fraction))
# 1.0
print(np.float64(1) + np.float64(smallest_32bit_fraction))
# 0.9999999995343387
So you could argue that, since this is a division of two integer values, you may want to mix the result with an integer; but as the output above shows, for 32-bit integers there are cases where the fraction is too small for a 32-bit float to represent next to 1, so it is lost entirely.
But again, this is just guessing and more of a thought exercise than anything else.
Upvotes: 2