Reputation: 1161
dask==2.27.0
python==3.8.3
Operating System== centos7
ddf.read_sql_table
.TypeError
when parameters to read_sql_table are passed as mentioned in documentation.Code:
from sqlalchemy.sql import text
from sqlalchemy.engine import create_engine
import dask.dataframe as ddf
DIALECT = '<value>'
SQL_DRIVER= '<value>'
USERNAME= '<value>'
PASSWORD = '<value>'
HOSTNAME = '<value>'
PORT = '<value>'
SID = '<value>'
ENGINE_PATH = DIALECT + '+' + SQL_DRIVER + '://' + USERNAME + ':' + PASSWORD +'@' + HOSTNAME + ':' + str(PORT) + '/' + SID
s = text("My complicated sql query")
df = ddf.read_sql_table(s, ENGINE_PATH, index_col='id', npartitions=10)
Error seen:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/synology/data/ganesh/arun/code_jan_27/art_rematching/venv_3.8_50/lib/python3.8/site-packages/dask/dataframe/io/sql.py", line 115, in read_sql_table
index = table.columns[index_col] if isinstance(index_col, str) else index_col
TypeError: 'method' object is not subscriptable
Upvotes: 0
Views: 133
Reputation: 63
So dask does not currently support complex queries directly in the text mechanism (as of v2021.02.0). My workaround is as follows:
See below for a simple example:
from sqlalchemy import Table, Metadata, Column, Integer
import dask.dataframe as ddf
import multiprocessing
uri = f'{dialect}://{user}:{password}@{host}:{port}/{dbName}'
view = '[NAME_OF_VIEW]'
schema = '[NAME_OF_SCHEMA]'
pkey = '[PRIMARY_KEY_COLUMN]'
myview = Table(view, Metadata(schema=schema), Column(pkey, Integer, primary_key=True))
df = ddf.read_sql_table(table=myview, uri=uri, index_col=pkey, schema=schema, npartitions=multiprocessing.cpu_count()*3)
Upvotes: 1