Round pandas data frame/series

Question

I have a column in a pandas data frame that looks like this (much longer but here's the top few rows):

>df_fill['col1']

0      5987.8866699999998672865
1     52215.5966699999989941716
2       201.8966700000000003001
3         3.8199999999999998401

I want to round the entire column to 5 decimal places. I can round it to integers, but not to any amount of digits after the decimal. The type for the column is float.

> np.around(df_fill['col1'], 0)

0      5988
1     52216
2       202
3         4

> np.around(df_fill['col1'], 5)

0      5987.8866699999998672865
1     52215.5966699999989941716
2       201.8966700000000003001
3         3.8199999999999998401

> (df_fill['col1']).round()

0      5988
1     52216
2       202
3         4

>(df_fill['col1']).round(5)

0      5987.8866699999998672865
1     52215.5966699999989941716
2       201.8966700000000003001
3         3.8199999999999998401

> (df_fill['col1']).round(decimals=5)

0      5987.8866699999998672865
1     52215.5966699999989941716
2       201.8966700000000003001
3         3.8199999999999998401

> str((df_fill['col1']).round(decimals=5))
'0      5987.8866699999998672865
1     52215.5966699999989941716
2       201.8966700000000003001
3         3.8199999999999998401\

What am I missing here?

unutbu · Accepted Answer

Floats can represent only a subset of the real numbers. They can exactly represent only those decimals that are sums of negative powers of two ("binary fractions"). After you round a float to 5 digits, the new float may not equal the real number with 5 decimal digits, since the decimal part may not be exactly expressible as a binary fraction. Instead, rounding returns the float closest to that real number.
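As a rough illustration of this point (a minimal sketch using only the standard library; the values 3.82 and 0.375 are picked for demonstration), `decimal.Decimal` can reveal the exact value a float stores:

```python
from decimal import Decimal

x = 3.82

# Decimal(float) converts the float's exact binary value to decimal digits,
# so it shows what is really stored rather than the short repr.
exact = Decimal(x)
print(exact)

# 3.82 is not a binary fraction, so the stored value differs from the decimal 3.82.
is_exact = exact == Decimal("3.82")
print(is_exact)

# Rounding to 5 decimals returns the float closest to 3.82,
# which is the very same float we started with.
rounded = round(x, 5)
print(rounded == x)

# By contrast, 0.375 = 1/4 + 1/8 is a binary fraction and is stored exactly.
print(Decimal(0.375) == Decimal("0.375"))
```

The long tail of digits in the question is therefore the float itself, not a sign that rounding failed.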

If you have set

pd.options.display.float_format = '{:.23g}'.format

then Pandas will show up to 23 digits in its string representation of floats:

import pandas as pd

pd.options.display.float_format = '{:.23g}'.format

df_fill = pd.DataFrame({'col1': [5987.8866699999998672865, 52215.5966699999989941716,
                                 201.8966700000000003001, 3.8199999999999998401]})

print(df_fill)
#                       col1
# 0 5987.8866699999998672865
# 1 52215.596669999998994172
# 2 201.89667000000000030013
# 3 3.8199999999999998401279

print(df_fill['col1'].round(5))
# 0   5987.8866699999998672865
# 1   52215.596669999998994172
# 2   201.89667000000000030013
# 3   3.8199999999999998401279
# Name: col1, dtype: float64

But if you set the float_format to display 5 decimal digits:

pd.options.display.float_format = '{:.5f}'.format

then

print(df_fill['col1'].round(5))

yields

0    5987.88667
1   52215.59667
2     201.89667
3       3.82000
Name: col1, dtype: float64

Note that the underlying float has not changed; only the way it is displayed has.
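One way to check this for yourself is sketched below. It assumes `pd.option_context` is used to switch the display format temporarily instead of setting `pd.options.display.float_format` globally; the Series `s` is a stand-in for the column in the question.

```python
import pandas as pd

# Stand-in for df_fill['col1'] from the question.
s = pd.Series([5987.88667, 52215.59667, 201.89667, 3.82], name="col1")
rounded = s.round(5)

# option_context applies the display format only inside the with-block.
with pd.option_context("display.float_format", "{:.5f}".format):
    short = rounded.to_string()

with pd.option_context("display.float_format", "{:.23g}".format):
    long = rounded.to_string()

print(short)  # 5 decimal places shown
print(long)   # up to 23 significant digits shown

# The stored values are identical in both cases; these inputs are already
# the floats closest to their 5-decimal values, so round(5) leaves them as is.
values_unchanged = rounded.equals(s)
print(values_unchanged)
```

If you only need the shorter form for output, changing the display format (globally or within a context) is enough; the values used in later calculations stay the same.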
