Reputation: 15871
I am aware that regular queryset or the iterator queryset methods evaluates and returns the entire data-set in one shot .
for instance, take this :
my_objects = MyObject.objects.all()
for rows in my_objects: # Way 1
for rows in my_objects.iterator(): # Way 2
Question
In both methods all the rows are fetched in a single-go.Is there any way in djago that the queryset rows can be fetched one by one from database.
Why this weird Requirement
At present my query fetches lets says n rows but sometime i get Python and Django OperationalError (2006, 'MySQL server has gone away').
so to have a workaround for this, i am currently using a weird while
looping logic.So was wondering if there is any native or inbuilt method or is my question even logical in first place!! :)
Upvotes: 5
Views: 511
Reputation: 53734
Fetching row by row might be worse. You might want to retrieve in batches for 1000s etc. I have used this Django snippet (not my work) successfully with very large querysets. It doesn't eat up memory and no trouble with connections going away.
Here's the snippet from that link:
import gc
def queryset_iterator(queryset, chunksize=1000):
'''''
Iterate over a Django Queryset ordered by the primary key
This method loads a maximum of chunksize (default: 1000) rows in it's
memory at the same time while django normally would load all rows in it's
memory. Using the iterator() method only causes it to not preload all the
classes.
Note that the implementation of the iterator does not support ordered query sets.
'''
pk = 0
last_pk = queryset.order_by('-pk')[0].pk
queryset = queryset.order_by('pk')
while pk < last_pk:
for row in queryset.filter(pk__gt=pk)[:chunksize]:
pk = row.pk
yield row
gc.collect()
Upvotes: 2
Reputation: 1015
To solve (2006, 'MySQL server has gone away') problem, your approach is not that logical. If you will hit database for each entry, it is going to increase number of queries which itself will create problem in future as usage of your application grows. I think you should close mysql connection after iterating all elements of result, and then if you will try to make another query, django will create a new connection.
from django.db import connection:
connection.close()
Upvotes: 1
Reputation: 43300
I think you are looking to limit your query set.
Quote from above link:
Use a subset of Python’s array-slicing syntax to limit your QuerySet to a certain number of results. This is the equivalent of SQL’s LIMIT and OFFSET clauses.
In other words, If you start with a count you can then loop over and take slices as you require them..
cnt = MyObject.objects.count()
start_point = 0
inc = 5
while start_point + inc < cnt:
filtered = MyObject.objects.all()[start_point:inc]
start_point += inc
Of course you may need to error handle this more..
Upvotes: 3