Reputation: 141
I have a scenario where I need to do a bulk insert between tables, and I configured my SQLAlchemy engine to do that. I configured it to do a bulk insert for each chunk (chunksize: 10000), but it is still doing inserts row by row.
destengine = sqlalchemy.create_engine(
    "mssql+pyodbc://"+destuser+":"+destpass+"@"+destaddress+"/"+destdatabase+"?driver=ODBC+Driver+17+for+SQL+Server",
    echo=False, fast_executemany=True)

for chunk in pd.read_sql(selectquery, srcconnection, chunksize=100000):
    chunk.to_sql("MyTable", destengine, if_exists='append', index=False)
I selected ODBC Driver 17 in my connection string, as shown above, and I set fast_executemany=True on my engine. What am I doing wrong? My table has millions and millions of rows and more than 100 columns. I am new to Pandas, but I think I am doing the same as the examples.
Upvotes: 2
Views: 885
Reputation: 123829
You have hit upon a variation of this:
fast_executemany ineffective when generator passed to executemany
The pandas docs state that if chunksize= is passed to read_sql() then it returns an iterator instead of a DataFrame. We can work around that by pulling the results into a "real" DataFrame and then calling to_sql() on that:
for chunk in pd.read_sql(selectquery, srcconnection, chunksize=10):
    df = pd.DataFrame(chunk.values, columns=chunk.columns)
    df.to_sql("MyTable", destengine, if_exists='append', index=False)
Upvotes: 1