Reputation: 3220
I have to deal with a large result set (possibly hundreds of thousands of rows, sometimes more).
Unfortunately, the rows all need to be retrieved at once (on startup).
I'm trying to do that using as little memory as possible.
Searching SO, I found that SSCursor might be what I'm looking for, but I still don't really know how to use it.
Is calling fetchall() on a base cursor the same as calling it on an SSCursor (in terms of memory usage)?
Can I 'stream' my rows from the SSCursor one by one (or a few at a time), and if so, what is the most efficient way to do it?
Upvotes: 37
Views: 20813
Reputation: 12461
Alternatively, you can instantiate SSCursor directly instead of setting it on the connection object. This matters when the connection is already defined and you don't want every cursor on it to use SSCursor as its cursorclass.
import MySQLdb
from MySQLdb.cursors import SSCursor  # or you can use SSDictCursor

connection = MySQLdb.connect(
    host=host, port=port, user=username, passwd=password, db=database)
cursor = SSCursor(connection)  # server-side cursor for this query only
cursor.execute(query)
for row in cursor:
    print(row)
Upvotes: 4
Reputation: 28268
Definitely use the SSCursor when fetching big result sets. It made a huge difference for me when I had a similar problem. You can use it like this:
import MySQLdb
import MySQLdb.cursors

connection = MySQLdb.connect(
    host=host, port=port, user=username, passwd=password, db=database,
    cursorclass=MySQLdb.cursors.SSCursor)  # put the cursorclass here
cursor = connection.cursor()
Now you can execute your query with cursor.execute() and use the cursor as an iterator.
Edit: removed unnecessary homegrown iterator, thanks Denis!
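On the "a few at a time" part of the question: calling fetchall() on an SSCursor still builds the complete list of rows in Python, so the saving comes from iterating or from fetching in batches. Below is a minimal sketch of a batching generator built on the standard DB-API fetchmany() call. The helper name iter_rows and the batch size are assumptions of mine, and sqlite3 is used only as a stand-in so the snippet runs without a MySQL server; with MySQLdb you would pass it a cursor created with SSCursor.

```python
import sqlite3  # stand-in driver so the sketch runs without a MySQL server


def iter_rows(cursor, batch_size=1000):
    """Yield rows one at a time while fetching them in batches.

    Hypothetical helper: works with any DB-API 2.0 cursor, but only a
    server-side cursor such as SSCursor avoids buffering the whole
    result set on the client.
    """
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:  # an empty sequence means no more rows
            break
        for row in batch:
            yield row


# Demo against an in-memory SQLite table (made-up data).
demo_connection = sqlite3.connect(":memory:")
demo_connection.execute("CREATE TABLE numbers (n INTEGER)")
demo_connection.executemany(
    "INSERT INTO numbers (n) VALUES (?)", [(i,) for i in range(10)])
demo_cursor = demo_connection.cursor()
demo_cursor.execute("SELECT n FROM numbers ORDER BY n")
total = sum(row[0] for row in iter_rows(demo_cursor, batch_size=3))
```

Only one batch of rows is held in Python at a time, so batch_size is the knob for trading memory against the number of fetch calls.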
Upvotes: 16
Reputation: 879371
I agree with Otto Allmendinger's answer, but to make Denis Otkidach's comment explicit, here is how you can iterate over the results without using Otto's fetch() function:
import MySQLdb
import MySQLdb.cursors

connection = MySQLdb.connect(
    host="thehost", user="theuser",
    passwd="thepassword", db="thedb",
    cursorclass=MySQLdb.cursors.SSCursor)
cursor = connection.cursor()
cursor.execute(query)
for row in cursor:  # rows are pulled from the server as you iterate
    print(row)
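One thing to watch with an unbuffered cursor: as far as I know, the connection cannot run another query until the current result set has been fully read or the cursor has been closed. A rough sketch of a wrapper that closes the cursor once streaming ends is shown here. The name stream_query and the make_cursor argument are hypothetical, and sqlite3 again stands in for MySQLdb so the snippet is runnable; with MySQLdb you could pass something like lambda: SSCursor(connection).

```python
import sqlite3  # stand-in driver so the sketch runs without a MySQL server
from contextlib import closing


def stream_query(make_cursor, query, params=None):
    """Yield rows from a query and close the cursor afterwards.

    Hypothetical helper: make_cursor is any zero-argument callable that
    returns a DB-API cursor.
    """
    with closing(make_cursor()) as cursor:
        if params is None:
            cursor.execute(query)
        else:
            cursor.execute(query, params)
        for row in cursor:
            yield row


# Demo against an in-memory SQLite table (made-up data).
demo_connection = sqlite3.connect(":memory:")
demo_connection.execute("CREATE TABLE items (name TEXT)")
demo_connection.executemany(
    "INSERT INTO items (name) VALUES (?)", [("a",), ("b",), ("c",)])
names = [row[0] for row in stream_query(
    demo_connection.cursor, "SELECT name FROM items ORDER BY name")]
```

Because this is a generator, nothing is executed until you start iterating, and the cursor is closed when the loop finishes or the generator is discarded.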
Upvotes: 39