Reputation: 1002
I would like to insert multiple rows with one insert statement.
I tried with
params = ((1, 2), (3, 4), (5, 6))
sql = 'insert into tablename (column_name1, column_name2) values (?, ?)'
cursor.fast_executemany = True
cursor.executemany(sql, params)
but under the hood it is just a simple loop over the params that calls the execute method for each row.
I also tried building a longer insert statement of the form INSERT INTO tablename (col1, col2) VALUES (?,?), (?,?), ..., (?,?).
def flat_map_list_of_tuples(list_of_tuples):
    return [element for tupl in list_of_tuples for element in tupl]

args_str = ', '.join('(?, ?)' for x in params)
sql = 'insert into tablename (column_name1, column_name2) values '
cursor.execute(sql + args_str, flat_map_list_of_tuples(params))
It worked and reduced the insertion time from 10.9 s to 6.1 s.
Is this solution correct? Does it have some vulnerabilities?
Upvotes: 0
Views: 7030
Reputation: 123654
Is this solution correct?
The solution you propose, which is to build a table value constructor (TVC), is not incorrect, but it is really not necessary. pyodbc with fast_executemany=True and Microsoft's ODBC Driver 17 for SQL Server is about as fast as you're going to get short of using BULK INSERT or bcp, as described in this answer.
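For comparison, here is a minimal sketch of that setup; the server, database, and table names are placeholders, not taken from the question:

import pyodbc

# Placeholder connection details; substitute your own server/database/credentials.
conn = pyodbc.connect(
    'DRIVER={ODBC Driver 17 for SQL Server};'
    'SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes;'
)
cursor = conn.cursor()

params = ((1, 2), (3, 4), (5, 6))
sql = 'insert into tablename (column_name1, column_name2) values (?, ?)'

# With fast_executemany enabled, pyodbc binds the parameter arrays and sends
# them to the driver in bulk instead of looping over execute() row by row.
cursor.fast_executemany = True
cursor.executemany(sql, params)
conn.commit()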
Does it have some vulnerabilities?
Since you are building a TVC for a parameterized query, you are protected from SQL injection vulnerabilities, but there are still a couple of implementation considerations:
A TVC can insert a maximum of 1000 rows at a time.
pyodbc executes SQL statements by calling a system stored procedure, and stored procedures in SQL Server can accept a maximum of 2100 parameters, so the number of rows your TVC can insert is also limited such that number_of_rows * number_of_columns < 2100.
In other words, your TVC approach will be limited to a "chunk size" of 1000 rows or less. The actual calculation is described in this answer; a chunked version of your code is sketched below.
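If you do stick with the TVC approach, a rough sketch of the chunking might look like this, reusing the params tuple and the flat_map_list_of_tuples helper from your question (the chunk-size arithmetic here is illustrative, not exact):

# Split the rows into chunks that stay under both limits:
# at most 1000 rows per TVC, and fewer than 2100 parameters per statement.
num_columns = 2
chunk_size = min(1000, 2000 // num_columns)  # leave a small margin under 2100

sql = 'insert into tablename (column_name1, column_name2) values '

for i in range(0, len(params), chunk_size):
    chunk = params[i:i + chunk_size]
    args_str = ', '.join('(?, ?)' for _ in chunk)
    cursor.execute(sql + args_str, flat_map_list_of_tuples(chunk))

cursor.commit()  # commits on the underlying connection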
Upvotes: 4