Reputation: 1512
I have a uint64 column in my DataFrame, but when I convert that DataFrame to a list of Python dicts using DataFrame.to_dict('record'), what was previously a uint64 gets magically converted to float:
In [24]: mid['bd_id'].head()
Out[24]:
0 0
1 6957860914294
2 7219009614965
3 7602051814214
4 7916807114255
Name: bd_id, dtype: uint64
In [25]: mid.to_dict('record')[2]['bd_id']
Out[25]: 7219009614965.0
In [26]: bd = mid['bd_id']
In [27]: bd.head().to_dict()
Out[27]: {0: 0, 1: 6957860914294, 2: 7219009614965, 3: 7602051814214, 4: 7916807114255}
How can I avoid this strange behavior?
Strangely enough, if I use to_dict() instead of to_dict('records'), the bd_id column will be of type int:
In [43]: mid.to_dict()['bd_id']
Out[43]:
{0: 0,
1: 6957860914294,
2: 7219009614965,
...
Upvotes: 9
Views: 8153
Reputation: 7833
You can use this:
from pandas.io.json import dumps
import json
# Serialize with zero decimal places, then parse back into Python objects
output = json.loads(dumps(mid, double_precision=0))
Upvotes: 1
Reputation: 36555
It's because another column has a float in it. More specifically, to_dict('records') is implemented using the values attribute of the DataFrame rather than the columns themselves, and values performs "implicit upcasting" to a single common dtype, in your case converting uint64 to float.
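To see the upcast in isolation, here is a minimal sketch. The DataFrame and the score column are made up for illustration (the question does not show the other columns), and it only demonstrates what .values does; how to_dict itself behaves may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one uint64 column next to one float column
df = pd.DataFrame({
    "bd_id": np.array([0, 6957860914294, 7219009614965], dtype="uint64"),
    "score": [0.5, 1.5, 2.5],
})

# Each column keeps its own dtype inside the DataFrame
print(df.dtypes)

# .values must return one homogeneous 2-D array, so it picks a common dtype
values = df.values
print(values.dtype)

# The former uint64 cell is now a NumPy float
print(type(values[2, 0]))
```

With only integer columns in the frame there is no float to upcast to, which is why selecting the single column bd_id and calling to_dict() on it keeps the ints.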
If you want to get around this bug, you could explicitly cast your DataFrame to the object datatype:
df.astype(object).to_dict('record')[2]['bd_id']
Out[96]: 7602051814214
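A self-contained sketch of the same workaround, again on a made-up frame (the score column is an assumption). I spell out 'records' in full here, since as far as I know newer pandas releases no longer accept the abbreviated 'record'.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: a uint64 column alongside a float column
df = pd.DataFrame({
    "bd_id": np.array([0, 6957860914294, 7219009614965], dtype="uint64"),
    "score": [0.5, 1.5, 2.5],
})

# Cast every column to object first, so no common numeric dtype is needed
records = df.astype(object).to_dict("records")

bd_id = records[2]["bd_id"]
print(bd_id, type(bd_id))
```

The trade-off is that object columns are slower and use more memory, so it is probably best to do the cast only at the point of export rather than keeping the frame that way.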
By the way, if you are using IPython and you want to see how a function is implemented in a library, you can bring up its source by putting ?? after its name. For pd.DataFrame.to_dict?? we see
...
elif orient.lower().startswith('r'):
    return [dict((k, v) for k, v in zip(self.columns, row))
            for row in self.values]
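Outside IPython, something similar can be done with the standard library's inspect module. This is just a sketch; the source you get back depends on the pandas version installed and may not match the snippet above.

```python
import inspect

import pandas as pd

# Fetch the source text of DataFrame.to_dict from the installed pandas
source = inspect.getsource(pd.DataFrame.to_dict)

# Print only the first few lines to keep the output short
print("\n".join(source.splitlines()[:10]))
```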
Upvotes: 18