Reputation: 139
I am using pyodbc and I want to return lots of data from different tables. What is the most efficient way to do this? Do I have to create a new connection each time or can I reuse one?
import pyodbc

def docLauncher(connectStr):
    conn = pyodbc.connect(connectStr)
    cursor = conn.cursor()
    cursor.execute('SELECT COLUMN1 FROM TABLE1')
    companyName = cursor.fetchone()[0]

    conn1 = pyodbc.connect(connectStr)
    cursor1 = conn1.cursor()
    cursor1.execute('SELECT COLUMN2 FROM TABLE2')
    ver = cursor1.fetchone()[0]

    print(companyName)
    print(ver)
Upvotes: 2
Views: 7449
Reputation: 100
You do not have to create new connections; a single one is sufficient. However, if you are returning a lot of data, I would not use the fetch functions, which I have found to be very time-consuming. This is how I access pyodbc and work through millions of rows (you can put this inside your docLauncher() function if you want):
import pyodbc

conn = pyodbc.connect(connection_string)
crsr = conn.cursor()

# execute() returns the cursor itself, so you can iterate over it row by row
my_data = crsr.execute(sql_string)
for row in my_data:
    # do something with your data here
    print(row)

# the same connection and cursor can be reused for the next query
my_data_2 = crsr.execute(sql_string_2)
for row_2 in my_data_2:
    print(row_2)
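Applied to the code in the question, a minimal sketch of the single-connection approach might look like this (keeping the docLauncher() name and queries from the question; fetchone() is fine here since each query returns a single value):

import pyodbc

def docLauncher(connectStr):
    # one connection and one cursor are enough for both queries
    conn = pyodbc.connect(connectStr)
    cursor = conn.cursor()

    cursor.execute('SELECT COLUMN1 FROM TABLE1')
    companyName = cursor.fetchone()[0]

    # reuse the same cursor for the second table
    cursor.execute('SELECT COLUMN2 FROM TABLE2')
    ver = cursor.fetchone()[0]

    print(companyName)
    print(ver)

    conn.close()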
Iterating over the cursor like this can also be very useful if you want to export your data (to CSV, for example). Instead of this:
for row in my_data:
    print(row)
You would do this:
import csv

with open(file_name, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow([x[0] for x in crsr.description])  # column headers
    for row in my_data:
        writer.writerow(row)
Using the for loop reduces my processing times from over an hour to just a few minutes (~3 minutes for 5 million rows x 30 columns).
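A possible variation (my own assumption, not something benchmarked in this answer): if you want fewer Python-level calls per row, you can batch with fetchmany() and write each batch with writerows(). The helper name and batch_size below are made up for illustration:

import csv
import pyodbc

def export_to_csv(connection_string, sql_string, file_name, batch_size=10000):
    # hypothetical helper: stream query results to a CSV file in batches
    conn = pyodbc.connect(connection_string)
    crsr = conn.cursor()
    crsr.execute(sql_string)

    with open(file_name, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow([x[0] for x in crsr.description])  # column headers
        while True:
            batch = crsr.fetchmany(batch_size)  # up to batch_size rows at a time
            if not batch:
                break
            writer.writerows(batch)  # one write call per batch instead of per row

    conn.close()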
Upvotes: 3