Reputation: 189
I'm having trouble quickly inserting large volumes of data from Python 3 into SQL Server.
The target table has 9 columns with 3 indexes and 1 primary key.
The code below works, but it's a lot slower than I would like. See the timings:
-- 1,000 records
In [35]: %time connection_factory.executemany(sql, args)
CPU times: user 30.2 ms, sys: 40.9 ms, total: 71.1 ms
Wall time: 3.54 s
-- 5,000 records
In [46]: %time connection_factory.executemany(sql, args)
CPU times: user 110 ms, sys: 55.8 ms, total: 166 ms
Wall time: 17 s
I've tried SQLAlchemy and am currently using turbodbc - but I'm open to anything else that works faster.
Below is a sample of my code:
from turbodbc import connect, make_options

class ConnectionFactory:
    def __init__(self):
        self.connection = self.initialize()

    @staticmethod
    def initialize():
        options = make_options(autocommit=True)
        return connect(driver="FREETDS",
                       server="",
                       port="",
                       database="",
                       uid="",
                       pwd="",
                       turbodbc_options=options)

    def execute(self, query, params=None):
        # acquire the cursor outside the try so it is always defined in finally
        cursor = self.connection.cursor()
        try:
            cursor.execute(query, params)
        except Exception as e:
            print(e)
        finally:
            cursor.close()

    def executemany(self, query, params=None):
        cursor = self.connection.cursor()
        try:
            cursor.executemany(query, params)
        except Exception as e:
            print(e)
        finally:
            cursor.close()
sql = """
INSERT INTO table1 (value1,
value2,
value3,
value4,
value5,
value6,
value7)
VALUES (?, ?, ?, ?, ?, ?, ?); """
args = df.to_records(index=False).tolist()  # index=False so each record has exactly 7 fields
connection_factory = ConnectionFactory()
connection_factory.executemany(sql, args)
Is anyone familiar with this exact combination of SQL Server and Python who could point me in the right direction?
Upvotes: 1
Views: 4770
Reputation: 13016
I can see that you already have an execute() function. Building all the rows into a single multi-row INSERT statement should be nearly as fast as a bulk insert.
args = ', '.join(map(str, df.to_records(index=False).tolist()))
sql = """
INSERT INTO table1 (value1,
                    value2,
                    value3,
                    value4,
                    value5,
                    value6,
                    value7)
VALUES {}""".format(args)
connection_factory = ConnectionFactory()
connection_factory.execute2(sql)
Create a new method that executes a query built as a plain string, without parameters:
def execute2(self, query):
    cursor = self.connection.cursor()
    try:
        cursor.execute(query)
    except Exception as e:
        print(e)
    finally:
        cursor.close()
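One caveat with this approach: SQL Server caps a single INSERT ... VALUES statement at 1000 row groups, so the 5,000-record case has to be sent in chunks. A minimal sketch, assuming the ConnectionFactory above (the insert_in_chunks helper and chunk_size name are mine, not part of the answer):
def insert_in_chunks(factory, records, chunk_size=1000):
    # SQL Server rejects INSERT ... VALUES with more than 1000 row groups,
    # so issue one statement per chunk of at most 1000 records.
    for start in range(0, len(records), chunk_size):
        args = ', '.join(map(str, records[start:start + chunk_size]))
        factory.execute2(
            "INSERT INTO table1 (value1, value2, value3, value4, "
            "value5, value6, value7) VALUES {}".format(args))

insert_in_chunks(connection_factory, df.to_records(index=False).tolist())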
Upvotes: 0
Reputation: 567
Sorry, my mistake - I originally posted information about MySQL, but you're looking for MS SQL Server.
Here is the equivalent bulk insert statement for MS SQL Server:
BULK INSERT MyTable
FROM 'path\myfile.csv'
WITH
(FIELDTERMINATOR = ';',
ROWTERMINATOR = '\n')
There are a few options:
You may write your data to a .csv file and then load it with the BULK INSERT
command above (LOAD DATA INFILE is the MySQL equivalent) - see the sketch after the list below.
OR
You may also use the multi-row form of the INSERT command:
INSERT INTO tbl_name
(a,b,c)
VALUES
(1,2,3),
(4,5,6),
(7,8,9);
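For the .csv route, here is a minimal sketch that reuses the asker's ConnectionFactory and target table; the csv_path value is a placeholder, and note that BULK INSERT reads the file on the SQL Server host, so the path must be visible from that machine:
import pandas as pd  # df is assumed to be a pandas DataFrame, as in the question

csv_path = r'\\fileshare\myfile.csv'  # placeholder; must be readable by the SQL Server host
df.to_csv(csv_path, sep=';', header=False, index=False)

connection_factory = ConnectionFactory()
connection_factory.execute2("""
BULK INSERT table1
FROM '{}'
WITH (FIELDTERMINATOR = ';',
      ROWTERMINATOR = '\\n')""".format(csv_path))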
Upvotes: 0