Reputation: 3355
I'm trying to use read_sql_query()
to read a query from MySQL database, one of the field in the database, its type is double(24, 8)
, I want to use dtype=
parameter to have full control of the datatypes and read it to decimal
, but seems like pandas can't recognize decimal
type so I had to read it to Float64
In the database, the values for this field look like this:
Value
100.96000000
77.17000000
1.00000000
0.12340000
Then I'm trying to read it from Python code:
from decimal import *
dtypes = {
'id': 'Int64',
'date': 'datetime64',
'value': 'Float64'
}
df = pd.read_sql_query(sql_query, mysql_engine, dtype=dtypes)
but after reading the data from the code above, it looks like this:
Value
100.96
77.17
1.0
0.1234
How can I read this column to decimal
and keep all the digits? Thanks.
Upvotes: 1
Views: 930
Reputation: 142472
What "the data looks like in the database" is tricky. This is because the act of printing it out feeds the bits through a formatting algorithm. In this case it removes trailing zeros. To see what is "in the database", one needs to get a hex dump of the file and then decipher it; this is non-trivial.
I believe that DECIMAL
numbers hold all the digits specified, packed 2 digits per byte. No, I don't know how they are packed (0..99 versus 2 hex digits; what to do if the number of digits is odd; where is the sign?)
I believe that FLOAT
and DOUBLE
exactly conform to IEEE-764 encoding format. No, I don't know how the bytes are stored (big-endian vs little-endian). I suspect Python's Float64
is IEEE DOUBLE
.
For DECIMAL(10,6)
, I would expect to see "1.234" to be stored as +
, 0001
, and 234000
, but never displayed with leading zeros and optionally displayed with trailing zeros -- depending on the output formatting package.
For DOUBLE
, I would expect to find hex 3ff3be76c8b43958
after adjusting for endianism, and I would not be surprised to see the output be 1.23399999999999999e+0
. (Yes, I actually got that, given a suitable formatting in PHP, which I am using.) I would hope to see 1.234
since that is presumably the intent of the number.
Do not use DOUBLE(m,n)
. The (m,n)
leads to extra rounding and it is deprecated syntax. Float and Double are not intended for exact number of decimal places; use DECIMAL
for such.
For FLOAT
: 1.234 becomes hex 3f9df3b6
and displays something like 1.2339999675751
assuming the output method works in DOUBLE
and is asked to show lots of decimal places.
Bottom line: The output method you are using is causing the problem.
Upvotes: 0