Reputation: 600
I am trying to upload a large dask dataframe to a sql server but i'm getting an error and can't see anything wrong with the connection string. I am able to connect to the DB using this connection string but I get the error when running the following:
import sqlalchemy as sa
import dask.dataframe as dd
from dask.diagnostics import ProgressBar
pbar = ProgressBar()
pbar.register()
#windows authentication + fast_executemany=True
to_sql_uri = sa.create_engine('mssql+pyodbc://TEST-BI/DB_TEST?driver=SQL Server?Trusted_Connection=yes', fast_executemany=True)
ddf.to_sql('test', uri=to_sql_uri, if_exists='replace', index=False)
ValueError: Expected URI to be a string, got <class 'sqlalchemy.engine.base.Engine'>.
i've tried forcing the conversion of to_sql_uri to string but still get an error.
Upvotes: 0
Views: 2606
Reputation: 3
Just letting know that dask.dataframe.to_sql
now (at least version 2023.12.1) accept engine_kwargs
parameter (Pull Request #8609) that can be used to provide underlying SQLAlchemy Engine with arguments. So in case you want to use dask.dataframe.to_sql
with fast_executemany
driver option, you now can do the following:
ddf.to_sql('test',
uri='mssql+pyodbc://TEST-BI/DB_TEST?driver=SQL Server?Trusted_Connection=yes',
if_exists='replace', index=False, engine_kwargs={"fast_executemany": True})
Upvotes: 0
Reputation: 28683
I'm sorry if the documentation is not clear enough (docs page, api reference). However, you did not pass a string in your call, you passed an engine instance, which is what the error message says.
You should do
ddf.to_sql('test',
uri='mssql+pyodbc://TEST-BI/DB_TEST?driver=SQL Server?Trusted_Connection=yes',
if_exists='replace', index=False)
Upvotes: 2