Reputation: 2889
I am trying to read a table from DB2. It seems, however, that the decimal separator is simply ignored, so values come out multiplied by 100. For instance, 100.50 becomes 10050.0 when read into a pandas DataFrame.
By the way, I am in Norway, so decimals are denoted by a , rather than a . and I don't know if this matters.
I use the following SQL (in WinSQL Lite):
SELECT CUSTOMER_ID, FUND_NAME, SAVINGS_AMOUNT
FROM SAVINGS_TABLE
WHERE CUSTOMER_ID = 1
This gives:
CUSTOMER_ID, FUND_NAME, SAVINGS_AMOUNT
1 ASSET_X 259131,72
1 ASSET_Y 718533,33
I know this is correct, so the issue is not the data.
In Spyder, using Python, I have
import pandas as pd
import pyodbc as po
DSN = 'MY_DSN'
UID = 'MY_USER'
PWD = 'MY_PASSWORD'
CON = po.connect('DSN={0}; UID={1}; PWD={2}'.format(DSN, UID, PWD))
SQL = """SELECT CUSTOMER_ID, FUND_NAME, SAVINGS_AMOUNT
FROM SAVINGS_TABLE
WHERE CUSTOMER_ID = 1"""
df = pd.read_sql(SQL, CON)
df
Out[16]:
CUSTOMER_ID FUND_NAME SAVINGS_AMOUNT
0 1 ASSET_X 25913172.0
1 1 ASSET_Y 71853333.0
What is happening here, and how can I fix it?
The column SAVINGS_AMOUNT is DECIMAL with Size=9 and Scale=2. Surely the issue is somehow related to this. I guess I have to set a parameter in pd.read_sql to interpret the column.
I know I can simply divide by 100 to correct this, but I don't want to do that. I want to read the correct numbers.
Also I'm using Windows.
Upvotes: 14
Views: 3858
Reputation: 46
You may find it helps to explicitly set the decimal separator to the one used in Norway before running the query:
pyodbc.setDecimalSeparator(',')
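One plausible explanation for the factor of 100, assuming the driver hands the DECIMAL value back as text formatted with a comma, is that any character other than a digit, a sign, or the expected separator gets dropped during parsing. The sketch below is only an illustrative model of that behaviour; `parse_decimal_text` is a hypothetical helper, not pyodbc's actual code.

```python
from decimal import Decimal


def parse_decimal_text(text: str, separator: str = ".") -> Decimal:
    """Hypothetical model: keep digits, signs and the expected separator only."""
    kept = "".join(
        ch for ch in text if ch.isdigit() or ch in "+-" or ch == separator
    )
    # Normalise the expected separator to the "." that Decimal understands.
    return Decimal(kept.replace(separator, "."))


raw = "259131,72"  # assumed text form of the value from the question

wrong = parse_decimal_text(raw, separator=".")  # comma is silently dropped
right = parse_decimal_text(raw, separator=",")  # comma is treated as the separator

print(wrong)  # 25913172
print(right)  # 259131.72
```

If this is what is happening, telling pyodbc which separator to expect should give the correct value without any division by 100.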
Upvotes: 3
Reputation: 564
Have you tried setting coerce_float=False, i.e. pd.read_sql(SQL, CON, coerce_float=False)? This may be an issue with converting the decimal to a float.
See https://pandas.pydata.org/pandas-docs/version/0.20/generated/pandas.read_sql.html
EDIT:
If that does not work, you can try casting the decimal column to a string in SQL and then, once in Python, replacing the comma with a decimal point and casting to float. A rather resource-intensive answer, but it gets the job done.
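A minimal sketch of the Python side of that workaround, assuming the query already returns SAVINGS_AMOUNT as text with a comma as the decimal separator. The helper name `comma_text_to_decimal` and the sample rows are made up for illustration; the values are the ones from the question.

```python
from decimal import Decimal


def comma_text_to_decimal(text: str) -> Decimal:
    """Convert text such as '259131,72' into Decimal('259131.72')."""
    return Decimal(text.strip().replace(",", "."))


# Sample rows as they might come back once the column is cast to a string.
rows = [("ASSET_X", "259131,72"), ("ASSET_Y", "718533,33")]

amounts = {fund: comma_text_to_decimal(value) for fund, value in rows}

for fund, amount in amounts.items():
    print(fund, amount)
```

On a DataFrame the same idea would be something like df["SAVINGS_AMOUNT"].str.replace(",", ".", regex=False).astype(float), at the cost of going from exact decimals to floats.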
Upvotes: 3
Reputation: 1004
In read_csv you can specify the decimal separator, but you cannot do this in read_sql. Your locale may not be set correctly. Before read_sql, try this:
import locale
locale.setlocale(locale.LC_NUMERIC, "no_NO")
Alternatively, if that isn't working, you could try casting the problematic column as a string in your SQL query, then doing a .replace to convert the comma to a period, and finally converting it back to a number. This doesn't require you to know the precision beforehand.
Upvotes: 3
Reputation: 478
Please try this. It may work.
import numpy as np
np.round(p_table, decimals=2)
Upvotes: 3
Reputation: 1077
Not sure if you've tried forcing the column type in the SQL itself:
SQL = """SELECT CUSTOMER_ID, FUND_NAME, CAST(SAVINGS_AMOUNT AS DECIMAL(20,2)) AS SAVINGS_AMOUNT
FROM SAVINGS_TABLE
WHERE CUSTOMER_ID = 1"""
Upvotes: 4