Reputation: 399
I have a query taking about 36 seconds to run, this is unviable, I've tried using select_related and prefetch, but the results don't improve.
This my query:
job_list = Job.objects.filter(Q(hiring_manager=emp) | Q(
followers=emp)).values_list('pk', flat=True)
emp is:
emp = Employee.objects.filter(
user=self.request.user, firm=obj).first().pk
Then i use in:
_candidates = Candidate.objects.filter(
job__in=job_list,
apply_date__range=[date_initial, date_end]
).order_by('-apply_date')
And ater:
_candidates.count()
This SQL:
SQL Executed:
SELECT ...
FROM `combo_candidate` WHERE (
`combo_candidate`.`job_id` IN (
SELECT ... FROM `combo_job` U0
LEFT OUTER JOIN `combo_job_followers` U2 ON U0.`id` = U2.`job_id`
WHERE U0.`hiring_manager_id` = 9482 OR U2.`user_id` = 9482
AND `combo_candidate`.`apply_date` BETWEEN '2019-01-01 02:00:00' AND '2019-02-01 02:00:00'
)
Time
33402,8768539 ms
hiring_manager is a FK in Job, followers is m2m in model Job
Upvotes: 2
Views: 152
Reputation: 2090
Excellent suggestion by @WillemVanOnsem. I am just attaching the Django documentation.
As per the Django documentation
Performance considerations
Be cautious about using nested queries and understand your database server’s performance characteristics (if in doubt, benchmark!). Some database backends, most notably MySQL, don’t optimize nested queries very well. It is more efficient, in those cases, to extract a list of values and then pass that into the second query. That is, execute two queries instead of one:
values = Blog.objects.filter( name__contains='Cheddar').values_list('pk', flat=True) entries = Entry.objects.filter(blog__in=list(values))
Note the list() call around the Blog QuerySet to force execution of the first query. Without it, a nested query would be executed, because QuerySets are lazy.
I place special emphasis on the "Note the list() call around the Blog QuerySet to force execution of the first query. Without it, a nested query would be executed, because QuerySets are lazy."
Hope it helps
Upvotes: 2
Reputation: 476503
I think the reason that this takes so long is because the database runs the subquery for each element in your combo_candidate
table, and thus if the number of candidates is high, then it will make a lot of subqueries.
The subquery is however "static": it does not require to be evaluated multiple times, since nothing in the subquery refers to the world "outside" the query. We thus can make two queries here: one where evaluate the subquery, and one where we evaluate the outer query.
We can force evaluation, for example by using list(..)
:
_candidates = Candidate.objects.filter(
job__in=list(job_list),
apply_date__range=[date_initial, date_end]
).order_by('-apply_date')
Upvotes: 2