Reputation: 24651
I have a query that returns many rows, and each row is large due to the presence of BLOBs that I need to fetch.
I make the query as follows:
import pymysql
db = pymysql.connect(...)
with db.cursor() as cur:
    cur.execute("select value from my_blobs")
    for row in cur:
        ...
I was naively expecting that iterating over cur, rather than calling fetchall(), would avoid getting all the data at once and running out of memory in the process. However, what I see is that memory runs out during the call to cur.execute(), that is, before I even try to get the results through fetchall(), fetchone(), or by iterating over cur.
My question is, how can I get my BLOBs one by one? Am I bound to iterate in the application and make a new query for each BLOB? Then what is the use of fetchone() or iterating over cur if the whole result set is fetched during execute() anyway?
Upvotes: 2
Views: 1533
Reputation: 2453
The default cursor class buffers all the data. Fortunately, there is an unbuffered version: pymysql.cursors.SSCursor. Try db.cursor(pymysql.cursors.SSCursor) instead of db.cursor().
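For example, here is a minimal sketch of the suggested change (the connection parameters are placeholders, not from the question):

import pymysql

# Placeholder connection parameters; substitute your own
db = pymysql.connect(host="localhost", user="user", password="secret", database="mydb")

# SSCursor streams rows from the server as they are read, instead of
# buffering the entire result set on the client during execute()
with db.cursor(pymysql.cursors.SSCursor) as cur:
    cur.execute("select value from my_blobs")
    for row in cur:
        ...  # rows are fetched on demand, so roughly one BLOB is in memory at a time

One caveat worth knowing: with an unbuffered cursor, the result set should be fully consumed (or the cursor closed) before you issue another query on the same connection.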
From the API reference for SSCursor:
Unbuffered Cursor, mainly useful for queries that return a lot of data, or for connections to remote servers over a slow network.
Instead of copying every row of data into a buffer, this will fetch rows as needed. The upside of this is the client uses much less memory, and rows are returned much faster when traveling over a slow network or if the result set is very big.
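In the same spirit, rows can also be pulled in small batches with fetchmany(), which SSCursor supports; a sketch, where process() is a hypothetical per-BLOB handler and cur is the open SSCursor from above:

# Assumes `cur` is an open SSCursor with a pending result set
while True:
    rows = cur.fetchmany(100)  # read up to 100 rows at a time
    if not rows:
        break
    for (value,) in rows:
        process(value)  # hypothetical handler for each BLOB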
Upvotes: 3