Noushadali

Reputation: 86

Convert numpy float to string

import numpy as np

number = np.float32(0.12345678)
assert str(number) == "0.12345678"
assert f"{number}" == "0.12345678359270096"

Why does converting a numpy float to a string give different results with the built-in str function and with an f-string?

Upvotes: 2

Views: 72

Answers (1)

Grismar

Reputation: 31379

As pointed out in the comments, the first part of the answer is that an f-string does not call str(): it calls __format__, so its output is often different from the str() result.

However, that leaves the question "why does __format__ result in more digits being rendered than __str__ for a numpy.float32?"

Have a look at this:

import numpy as np

x = np.float32(0.12345678)
print('np.float32 __str__ method return value:', x.__str__())
print('np.float32 __repr__ method return value:', x.__repr__())
print('np.float32 __format__ method return value with "f":', x.__format__('f'))
print('np.float32 formatted by an f-string:', f'{x}')

print('float __str__ method return value:', float(x).__str__())
print('float __repr__ method return value:', float(x).__repr__())
print('float __format__ method return value with "f":', float(x).__format__('f'))
print('float formatted by an f-string:', f'{float(x)}')

Output:

np.float32 __str__ method return value: 0.12345678
np.float32 __repr__ method return value: 0.12345678
np.float32 __format__ method return value with "f": 0.123457
np.float32 formatted by an f-string: 0.12345678359270096
float __str__ method return value: 0.12345678359270096
float __repr__ method return value: 0.12345678359270096
float __format__ method return value with "f": 0.123457
float formatted by an f-string: 0.12345678359270096

It's apparent that printing a numpy.float32 through an f-string actually prints the float conversion of that numpy.float32.

The f-string calls .__format__ on the numpy.float32. That method first converts the value of x to a Python float and then formats that float; with an empty format spec this gives the same text as calling .__str__ on the float, so you get the float's string representation instead of that of the numpy.float32. (This answers the question; the rest just provides some extra background.)
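To make that dispatch concrete, here is a minimal sketch with a hypothetical stand-in class. Float32Like is made up for illustration and is not NumPy's implementation; it only mimics the behaviour described above. It also shows that the !s conversion in an f-string forces str() to be used.

```python
class Float32Like:
    """Hypothetical stand-in for a NumPy scalar; not NumPy's actual code."""

    def __init__(self, value, short_text):
        self.value = value            # the value widened to a Python float
        self.short_text = short_text  # the short text that str() should return

    def __str__(self):
        return self.short_text

    def __format__(self, spec):
        # Delegate to the Python float, as described for numpy.float32
        return format(self.value, spec)


x = Float32Like(0.12345678359270096, "0.12345678")

via_str = str(x)            # uses __str__
via_fstring = f"{x}"        # uses __format__ with an empty format spec
via_conversion = f"{x!s}"   # !s applies str() first, then formats that string

print(via_str)         # 0.12345678
print(via_fstring)     # 0.12345678359270096
print(via_conversion)  # 0.12345678
```

With a real numpy.float32 the same !s conversion applies, so f"{x!s}" gives the same text as str(x).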

The extra digits that you didn't type are, of course, the result of floating point imprecision. A float can only represent a finite set of real numbers exactly, and when you don't limit the precision yourself, you get a representation of the closest value it can store. 0.12345678 can't be the exact value of a float.
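If you want to check this yourself, one option (a sketch using only the standard library, not something from the question) is decimal.Decimal, which converts a float to its exact decimal expansion without rounding:

```python
from decimal import Decimal

x = 0.12345678

exact = Decimal(x)                  # exact value of the stored binary double
intended = Decimal("0.12345678")    # the decimal number that was typed

print(exact)             # many more digits than were typed
print(exact - intended)  # how far the stored value is from the typed one
```

Passing the float gives the stored value, while passing the string gives the number as written, which is why the two differ.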

Edit: Note that user @markransom pointed out another interesting quirk you may run into when using Python 2, namely that __str__ and __repr__ would give different results. See the question "Why does str(float) return more digits in Python 3 than Python 2?"

Also, in case you are wondering what is going on with floats in detail, have a look at this:

import struct
import math

x = 0.12345678

# IEEE 754 binary representation of the float
binary = struct.unpack('!Q', struct.pack('!d', x))[0]

# Extract sign, exponent, and mantissa
sign = (binary >> 63) & 0x1
exponent = ((binary >> 52) & 0x7FF) - 1023  # Unbias the exponent
mantissa = binary & ((1 << 52) - 1)        # Lower 52 bits

# Reconstruct the value, and the closest smaller value (mantissa - 1)
value = (-1)**sign * (1 + mantissa / 2**52) * 2**exponent
prev_value = (-1)**sign * (1 + (mantissa-1) / 2**52) * 2**exponent

print(math.isclose(x, value, rel_tol=1e-15))  # True if reconstructed correctly
print(sign, exponent, mantissa, f'{mantissa:b}')  # Show sign, exponent, and mantissa (and in binary)
print(f'{value:.56f}')  # Show exact value stored in float
print(f'{prev_value:.56f}')

Output:

True
0 -4 4392398907099285 1111100110101101110100010000100100011100100010010101
0.12345678000000000207325712153760832734405994415283203125
0.12345677999999998819546931372315157204866409301757812500

This shows you the size of one ULP (unit in the last place): the gap between the value stored for the number you gave and the closest smaller float. Those two are the closest you can get to 0.12345678 with a float.
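As a cross-check on the manual bit twiddling, the standard library can report the same spacing directly. This is a small sketch that assumes Python 3.9 or later, where math.ulp and math.nextafter were added:

```python
import math

x = 0.12345678

ulp = math.ulp(x)               # spacing between x and the next larger float
below = math.nextafter(x, 0.0)  # closest float smaller than x

print(ulp)
print(x - below == ulp)    # True: the neighbour below is exactly one ULP away
print(ulp == 2.0 ** -56)   # True: exponent -4 and 52 mantissa bits
```

The last line ties back to the unbiased exponent of -4 shown above: one ULP is 2**-4 * 2**-52.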

Finally, note that a numpy.float32 uses a different, smaller (32-bit) representation with its own limitations. That explains why you end up with a different value, one that is farther from 0.12345678 than the closest a Python float can get.

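You can see the exact value stored in the numpy.float32 with:

x = np.float32(0.12345678)  # x was rebound to a Python float in the previous snippet
print('np.float32 formatted by an f-string with "g":', f'{x:.27g}')

Output:

np.float32 formatted by an f-string with "g": 0.123456783592700958251953125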
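The same 32-bit rounding can be reproduced without NumPy, as a rough sketch: packing the value into a 4-byte IEEE 754 single with the struct module and unpacking it again yields the nearest float32 value, widened back to a Python float.

```python
import struct

x = 0.12345678

# Round-trip through a 4-byte single to get the nearest float32 value
as_float32 = struct.unpack('!f', struct.pack('!f', x))[0]

print(repr(as_float32))      # 0.12345678359270096
print(f'{as_float32:.27g}')  # 0.123456783592700958251953125
print(as_float32 - x)        # error introduced by the narrower format
```

This is the same number the f-string printed for the numpy.float32, which fits the explanation that formatting goes through a Python float.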

Upvotes: 5
