Reputation: 609
I have a DataFrame of 400 rows.
When I try to push the data to SQL Server (on an Azure VM), it's taking about 4 minutes for the insert.
Below is the executemany statement I am using:
cursor.executemany(insert_query, df)
I am aware that cursor.executemany does row-by-row inserts, but for just 400 rows it's strange for it to take 4 minutes.
What can I do to troubleshoot this issue?
Upvotes: 0
Views: 916
Reputation: 16431
Please refer to the pyodbc Cursor documentation. You can call executemany(sql, *params) with fast_executemany=True.
Executes the SQL statement for the entire set of parameters, returning None. The single params parameter must be a sequence of sequences, or a generator of sequences.
params = [ ('A', 1), ('B', 2) ]
cursor.fast_executemany = True
cursor.executemany("insert into t(name, id) values (?, ?)", params)
Here, all the parameters are sent to the database server in one bundle (along with the SQL statement), and the database executes the SQL against all the parameters as one database transaction. Hence, this form of executemany() should be much faster than the default executemany(). However, there are limitations to it, see fast_executemany for more details.
Please modify your code and test again:
cursor.fast_executemany = True
cursor.executemany(insert_query, df)
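Also note that, per the documentation quoted above, the params argument must be a sequence of sequences, so if df is a pandas DataFrame it should be converted to a list of row tuples first. Here is a minimal end-to-end sketch; the connection string, table name (dbo.t), and column names are placeholders for illustration, so adjust them to your schema:

import pandas as pd
import pyodbc

# Stand-in for the real 400-row DataFrame.
df = pd.DataFrame({"name": ["A", "B"], "id": [1, 2]})

# Placeholder connection string; adjust driver, server, database, and credentials.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=your-azure-vm;DATABASE=your_db;UID=user;PWD=password"
)
cursor = conn.cursor()

# executemany expects a sequence of sequences, not a DataFrame;
# itertuples(index=False, name=None) yields one plain tuple per row.
params = list(df.itertuples(index=False, name=None))

cursor.fast_executemany = True  # send all rows to the server in one batch
cursor.executemany("INSERT INTO dbo.t (name, id) VALUES (?, ?)", params)
conn.commit()
cursor.close()
conn.close()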
Hope this helps.
Upvotes: 2