Reputation: 1049
I just used this dataframe to test an algorithm for statistics:
d1=pd.DataFrame([[0.1,0.2],[0.3,0.4],[0.5,0.6],[0.7,0.8],[0.9,0.81],[0.91,0.82],[0.93,0.94],[0.95,0.96],[0.97,0.98],[0.99,1]])
recalling:
d1.iloc[0,1]
yields 0.20000000000000001
d1.iloc[2,1]
yields 0.59999999999999998
performing
d1=pd.DataFrame([[0.1,0.2],[0.3,0.4],[0.5,0.6],[0.7,0.8],[0.9,0.81],[0.91,0.82],[0.93,0.94],[0.95,0.96],[0.97,0.98],[0.99,1]]).astype(np.float)
or
d1=pd.DataFrame([[0.1,0.2],[0.3,0.4],[0.5,0.6],[0.7,0.8],[0.9,0.81],[0.91,0.82],[0.93,0.94],[0.95,0.96],[0.97,0.98],[0.99,1]], dtype=np.float)
doesn't change the results
On the other hand, b=np.float(0.2) and c=np.float(0.6) give correct values when recalled.
Did I miss something or is there really a problem with data management in pandas? It is very important to me as I need precision for my data.
Thanks
Upvotes: 0
Views: 473
Reputation: 231738
Using np.array rather than pandas, compare the display of one element:
x=np.array([[0.1,0.2],[0.3,0.4],[0.5,0.6],[0.7,0.8],[0.9,0.81],[0.91,0.82],[0.93,0.94],[0.95,0.96],[0.97,0.98],[0.99,1]])
x[0,1]
Out[47]: 0.20000000000000001
float(x[0,1])
Out[48]: 0.2
np.float(x[0,1]) # np.float is an alias for the builtin float (64-bit), not float32
Out[49]: 0.2
np.float64(x[0,1])
Out[50]: 0.20000000000000001
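All of these hold the same 64-bit value; only the display differs. float and np.float (the same builtin type) print the shortest string that round-trips, 0.2. np.float64 here prints 17 significant digits, which exposes the extra nonzero digits at the end that come from 0.2 not being exactly representable in binary. Nothing is lost or altered by pandas or NumPy.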
This is the same information that Warren provided in the comments.
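To check that the long and short printouts refer to the same stored number, here is a minimal sketch using only the standard library. It assumes that a Python float is the same 64-bit double that np.float64 holds; the variable names are just for illustration.

```python
from decimal import Decimal

value = 0.2  # Python float: a 64-bit IEEE 754 double

short_form = repr(value)           # shortest string that round-trips
long_form = format(value, ".17g")  # 17 significant digits
exact = Decimal(value)             # exact decimal expansion of the stored binary value

print(short_form)  # 0.2
print(long_form)   # 0.20000000000000001
print(exact)       # the full expansion; it is slightly above 0.2

# Both strings parse back to the very same double, so no precision differs.
same_value = float(short_form) == float(long_form) == value
print(same_value)  # True
```

If you want the short display in pandas, adjusting the display precision (for example with pd.set_option('display.precision', ...)) changes only what is printed, not the stored values.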
Upvotes: 1