Philip Roland Jarnhus
Philip Roland Jarnhus

Reputation: 106

Dask/Python: Converting char to int for index column in read_sql_table

I have a difficult time wrapping my head around the Dask read_sql_table method. I manage to connect to the database just fine, but the column I wish to use as my index column is a char type holding only integers.

I have tried:

from urllib.parse import quote_plus
server = 'SERVER_NAME'
db = 'DB_NAME'
table = 'TABLE_NAME'
connection_string = 'DRIVER={SQL Server};SERVER=' + server + ';DATABASE=' + db + ';Trusted_Connection=yes'
connection_string = quote_plus(connection_string)
connection_string = 'mssql+pyodbc:///?odbc_connect='+connection_string
df = dd.read_sql_table(table,connection_string,'sql.cast(sql.column("ID"),int).label("ID")')

(The server, db, table and column names have all been replaced with dummies here, as it is the company database)

The error I get is:

KeyError                                  Traceback (most recent call last)
<ipython-input-25-8e261dcd8696> in <module>()
  6 connection_string = quote_plus(connection_string)
  7 connection_string = 'mssql+pyodbc:///?odbc_connect='+connection_string
----> 8 df = dd.read_sql_table(table,connection_string,'sql.cast(sql.column("ID"),int).label("ID")')

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\dask\dataframe\io\sql.py in read_sql_table(table, uri, index_col, divisions, npartitions, limits, columns, bytes_per_chunk, **kwargs)
 73                          schema=schema)
 74 
---> 75     index = (table.columns[index_col] if isinstance(index_col, six.string_types)
 76              else index_col)
 77     if not isinstance(index_col, six.string_types + (elements.Label,)):

~\AppData\Local\Continuum\Anaconda3\lib\site-packages\sqlalchemy\util\_collections.py in __getitem__(self, key)
192 
193     def __getitem__(self, key):
--> 194         return self._data[key]
195 
196     def __delitem__(self, key):

KeyError: 'sql.cast(sql.column("ID"),int).label("ID")'

Anyone knows how to fix it?

Upvotes: 1

Views: 937

Answers (1)

Philip Roland Jarnhus
Philip Roland Jarnhus

Reputation: 106

OK... All it took was a bit more mucking about. The error was that I tried to pass the SQLalchemy expression as a string. I should have loaded the SQLalchemy modules and written it as a proper expression:

from urllib.parse import quote_plus
from sqlalchemy import sql, types
server = 'SERVER_NAME'
db = 'DB_NAME'
table = 'TABLE_NAME'
connection_string = 'DRIVER={SQL Server};SERVER=' + server + ';DATABASE=' + db + ';Trusted_Connection=yes'
connection_string = quote_plus(connection_string)
connection_string = 'mssql+pyodbc:///?odbc_connect='+connection_string
df = dd.read_sql_table(table,connection_string,sql.cast(sql.column("ID"),types.BigInteger).label("ID"))

Upvotes: 4

Related Questions