wawawa
wawawa

Reputation: 3355

pandas `read_sql_query` - read `double` datatype in MySQL database to `Decimal`

I'm trying to use read_sql_query() to read a query from MySQL database, one of the field in the database, its type is double(24, 8), I want to use dtype= parameter to have full control of the datatypes and read it to decimal, but seems like pandas can't recognize decimal type so I had to read it to Float64

In the database, the values for this field look like this:

Value
100.96000000
77.17000000
1.00000000
0.12340000

Then I'm trying to read it from Python code:

from decimal import *

dtypes = {
        'id': 'Int64',
        'date': 'datetime64',
        'value': 'Float64'
    }

df = pd.read_sql_query(sql_query, mysql_engine, dtype=dtypes)

but after reading the data from the code above, it looks like this:

Value
100.96
77.17
1.0
0.1234

How can I read this column to decimal and keep all the digits? Thanks.

Upvotes: 1

Views: 930

Answers (1)

Rick James
Rick James

Reputation: 142472

What "the data looks like in the database" is tricky. This is because the act of printing it out feeds the bits through a formatting algorithm. In this case it removes trailing zeros. To see what is "in the database", one needs to get a hex dump of the file and then decipher it; this is non-trivial.

I believe that DECIMAL numbers hold all the digits specified, packed 2 digits per byte. No, I don't know how they are packed (0..99 versus 2 hex digits; what to do if the number of digits is odd; where is the sign?)

I believe that FLOAT and DOUBLE exactly conform to IEEE-764 encoding format. No, I don't know how the bytes are stored (big-endian vs little-endian). I suspect Python's Float64 is IEEE DOUBLE.

For DECIMAL(10,6), I would expect to see "1.234" to be stored as +, 0001, and 234000, but never displayed with leading zeros and optionally displayed with trailing zeros -- depending on the output formatting package.

For DOUBLE, I would expect to find hex 3ff3be76c8b43958 after adjusting for endianism, and I would not be surprised to see the output be 1.23399999999999999e+0. (Yes, I actually got that, given a suitable formatting in PHP, which I am using.) I would hope to see 1.234 since that is presumably the intent of the number.

Do not use DOUBLE(m,n). The (m,n) leads to extra rounding and it is deprecated syntax. Float and Double are not intended for exact number of decimal places; use DECIMAL for such.

For FLOAT: 1.234 becomes hex 3f9df3b6 and displays something like 1.2339999675751 assuming the output method works in DOUBLE and is asked to show lots of decimal places.

Bottom line: The output method you are using is causing the problem.

Upvotes: 0

Related Questions