Ando Jurai

Reputation: 1049

Numpy float storage into Dataframe seems to be wrongly managed

I just used this dataframe to test an algorithm for statistics:

d1=pd.DataFrame([[0.1,0.2],[0.3,0.4],[0.5,0.6],[0.7,0.8],[0.9,0.81],[0.91,0.82],[0.93,0.94],[0.95,0.96],[0.97,0.98],[0.99,1]])

recalling:

performing

d1=pd.DataFrame([[0.1,0.2],[0.3,0.4],[0.5,0.6],[0.7,0.8],[0.9,0.81],[0.91,0.82],[0.93,0.94],[0.95,0.96],[0.97,0.98],[0.99,1]]).astype(np.float)

or

d1=pd.DataFrame([[0.1,0.2],[0.3,0.4],[0.5,0.6],[0.7,0.8],[0.9,0.81],[0.91,0.82],[0.93,0.94],[0.95,0.96],[0.97,0.98],[0.99,1]], dtype=np.float)

doesn't change the results

On the other hand, b=np.float(0.2) and c=np.float(0.6) give correct values when recalled.

Did I miss something or is there really a problem with data management in pandas? It is very important to me as I need precision for my data.

Thanks

Upvotes: 0

Views: 473

Answers (1)

hpaulj

Reputation: 231738

Using np.array rather than pandas, compare the display of one element:

x=np.array([[0.1,0.2],[0.3,0.4],[0.5,0.6],[0.7,0.8],[0.9,0.81],[0.91,0.82],[0.93,0.94],[0.95,0.96],[0.97,0.98],[0.99,1]])

x[0,1]
Out[47]: 0.20000000000000001

float(x[0,1])
Out[48]: 0.2

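np.float(x[0,1])   # np.float was only an alias for Python's built-in float (64-bit), not np.float32; removed in NumPy 1.24
Out[49]: 0.2

np.float64(x[0,1])
Out[50]: 0.20000000000000001

All of these are the same 64-bit value; only the display differs. The np.float64 repr shown here prints 17 significant digits, which exposes the binary approximation of 0.2, while Python's float prints the shortest string that round-trips to the same value. Nothing is lost or corrupted when the number is stored in the array or the DataFrame.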
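To check that the stored value and its printed text are separate things, here is a minimal sketch using a shortened version of the DataFrame from the question. The variable names (`stored`, `as_python`, `digits17`, and so on) are illustrative only, and the exact default display may vary between NumPy and pandas versions.

```python
import numpy as np
import pandas as pd

# Shortened version of the DataFrame from the question.
d1 = pd.DataFrame([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])

stored = d1.iloc[0, 1]        # a numpy.float64 taken from the DataFrame
as_python = float(stored)     # the same value as a built-in Python float

# The value is identical to the literal 0.2, both by comparison and bit for bit.
same_value = bool(stored == 0.2) and as_python == 0.2
same_bits = np.float64(0.2).tobytes() == stored.tobytes()

# Asking for 17 significant digits exposes the binary approximation of 0.2.
digits17 = format(as_python, ".17g")   # '0.20000000000000001'

# For contrast, a real float32 holds a different (less precise) value.
differs_in_float32 = bool(np.float32(0.2) != np.float64(0.2))

print(d1.dtypes.tolist(), same_value, same_bits, digits17, differs_in_float32)
```

If the extra digits are only a display concern, formatting the output (for example with `format(value, ".2f")`) leaves the stored data untouched.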

This is the same information that Warren provided in the comments.

Upvotes: 1
