Reputation: 106
I have a difficult time wrapping my head around the Dask read_sql_table method. I manage to connect to the database just fine, but the column I wish to use as my index column is a char type holding only integers.
I have tried:
from urllib.parse import quote_plus
server = 'SERVER_NAME'
db = 'DB_NAME'
table = 'TABLE_NAME'
connection_string = 'DRIVER={SQL Server};SERVER=' + server + ';DATABASE=' + db + ';Trusted_Connection=yes'
connection_string = quote_plus(connection_string)
connection_string = 'mssql+pyodbc:///?odbc_connect='+connection_string
df = dd.read_sql_table(table,connection_string,'sql.cast(sql.column("ID"),int).label("ID")')
(The server, db, table and column names have all been replaced with dummies here, as it is the company database)
The error I get is:
KeyError Traceback (most recent call last)
<ipython-input-25-8e261dcd8696> in <module>()
6 connection_string = quote_plus(connection_string)
7 connection_string = 'mssql+pyodbc:///?odbc_connect='+connection_string
----> 8 df = dd.read_sql_table(table,connection_string,'sql.cast(sql.column("ID"),int).label("ID")')
~\AppData\Local\Continuum\Anaconda3\lib\site-packages\dask\dataframe\io\sql.py in read_sql_table(table, uri, index_col, divisions, npartitions, limits, columns, bytes_per_chunk, **kwargs)
73 schema=schema)
74
---> 75 index = (table.columns[index_col] if isinstance(index_col, six.string_types)
76 else index_col)
77 if not isinstance(index_col, six.string_types + (elements.Label,)):
~\AppData\Local\Continuum\Anaconda3\lib\site-packages\sqlalchemy\util\_collections.py in __getitem__(self, key)
192
193 def __getitem__(self, key):
--> 194 return self._data[key]
195
196 def __delitem__(self, key):
KeyError: 'sql.cast(sql.column("ID"),int).label("ID")'
Anyone knows how to fix it?
Upvotes: 1
Views: 937
Reputation: 106
OK... All it took was a bit more mucking about. The error was that I tried to pass the SQLalchemy expression as a string. I should have loaded the SQLalchemy modules and written it as a proper expression:
from urllib.parse import quote_plus
from sqlalchemy import sql, types
server = 'SERVER_NAME'
db = 'DB_NAME'
table = 'TABLE_NAME'
connection_string = 'DRIVER={SQL Server};SERVER=' + server + ';DATABASE=' + db + ';Trusted_Connection=yes'
connection_string = quote_plus(connection_string)
connection_string = 'mssql+pyodbc:///?odbc_connect='+connection_string
df = dd.read_sql_table(table,connection_string,sql.cast(sql.column("ID"),types.BigInteger).label("ID"))
Upvotes: 4