gomeslhlima

Reputation: 141

to_sql() inserting row by row despite fast_executemany=True in create_engine()

I have a scenario where I need to do a bulk insert between tables, and I configured my SQLAlchemy engine to do that. I configured it to do a bulk insert for each chunk (chunksize: 10000), but it is still inserting row by row.

destengine = sqlalchemy.create_engine(
    "mssql+pyodbc://" + destuser + ":" + destpass + "@" + destaddress + "/" + destdatabase
    + "?driver=ODBC+Driver+17+for+SQL+Server",
    echo=False, fast_executemany=True)

for chunk in pd.read_sql(selectquery, srcconnection, chunksize=100000):
    chunk.to_sql("MyTable", destengine, if_exists='append', index=False)

I selected ODBC Driver 17 in my connection string as shown above, and I set fast_executemany=True on the engine. What am I doing wrong? My table has millions and millions of rows and more than 100 columns. I am new to Pandas, but I think I am doing the same thing as the examples.

Upvotes: 2

Views: 885

Answers (1)

Gord Thompson

Reputation: 123829

You have hit upon a variation of this:

fast_executemany ineffective when generator passed to executemany

The pandas docs state that if chunksize= is passed to read_sql() then it returns an iterator instead of a DataFrame. We can work around that by pulling each chunk into a "real" DataFrame and then calling to_sql() on that:

for chunk in pd.read_sql(selectquery, srcconnection, chunksize=10):
    # materialize the chunk into a plain DataFrame before handing it to to_sql()
    df = pd.DataFrame(chunk.values, columns=chunk.columns)
    df.to_sql("MyTable", destengine, if_exists='append', index=False)

Upvotes: 1
