Specify data type of columns in a dataframe returned from SQL Server

Question

I am retrieving data from a SQL Server database using pandas with the line below.

df = pd.read_sql_query(query, cnxn)

A dataframe is returned, which is what I want. However, I have noticed that the columns do not always have the correct data type; for example, sometimes a number comes back as a string.

What is the best way to get around this?

1) Should I initialise an empty dataframe with the correct dtypes for the columns and then populate it by looping through the cursor result?

2) Should I take the dataframe that is returned (df in the example above) and use astype() and other converters on the columns that require conversion?

3) Or is there a way to specify in read_sql_query what data type you are expecting for each column from your query?

Josh Friedlander · Accepted Answer

By default you have coerce_float=True, and you can feed a list of date columns into parse_dates. You don't have explicit dtypes support as in read_csv and other IO methods. There's a discussion about it here.
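As a rough sketch of how this can look in practice, the example below combines parse_dates with a conversion step after reading (option 2 in the question). It assumes an in-memory SQLite database as a stand-in for the SQL Server connection, and the table and column names (orders, id, amount, created) are made up for illustration.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for the SQL Server connection (cnxn) in the question.
cnxn = sqlite3.connect(":memory:")
cnxn.execute("CREATE TABLE orders (id INTEGER, amount TEXT, created TEXT)")
cnxn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "10.5", "2021-01-01"), (2, "20.0", "2021-02-15")],
)
cnxn.commit()

query = "SELECT id, amount, created FROM orders"

# parse_dates converts the listed columns to datetimes while reading.
df = pd.read_sql_query(query, cnxn, parse_dates=["created"])

# Numbers that arrived as strings are converted after the read.
df["amount"] = pd.to_numeric(df["amount"])
df = df.astype({"id": "int64"})

print(df.dtypes)
```

If your pandas is 1.3 or newer, read_sql_query also appears to accept a dtype argument (a type name or a dict of column to type), so it is worth checking the documentation for the version you have installed before falling back to a manual conversion.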
