Reputation: 1226
I run a dog salon where dogs get haircuts on an infrequent basis. In order to encourage owners back I would like to send out vouchers for their next visit. The voucher will be based on whether a dog has had a haircut within the last 2 months to 2 years. Beyond 2 years ago we can assume that the customer has been lost and less than 2 months ago is too close to their previous haircut. We will first target owners that have recently visited.
My underlying database is PostgreSQL.
from datetime import timedelta
from django.db import models
from django.db.models import Max, OuterRef, Subquery
from django.utils import timezone
# Dogs have one owner, owners can have many dogs, dogs can have many haircuts
class Owner(models.model):
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
name = models.CharField(max_length=255)
class Dog(models.model):
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
owner = models.ForeignKey(Owner, on_delete=models.CASCADE, related_name="dogs")
name = models.CharField(max_length=255)
class Haircut(models.model):
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
dog = models.ForeignKey(Dog, on_delete=models.CASCADE, related_name="haircuts")
at = models.DateField()
today = timezone.now().date()
start = today - timedelta(years=2)
end = today - timedelta(months=2)
It strikes me that the problem can be broken down into two queries. The first is something that aggregates an owner's dogs to most recently cut within the last 2 months to 2 years.
dog_aggregate = Haircut.objects.annotate(Max("at")).filter(at__range=(start, end))
And then joins the result of that to the owners table.
owners_by_shaggiest_dog_1 = Owner.objects # what's the rest of this?
Resulting in SQL similar to:
select
owner.id,
owner.name
from
(
select
dog.owner_id,
max(haircut.at) last_haircut
from haircut
left join dog on haircut.dog_id = dog.id
where
haircut.at
between current_date - interval '2' year
and current_date - interval '2' month
group by
dog.owner_id
) dog_aggregate
left join owner on dog_aggregate.owner_id = owner.id
order by
dog_aggregate.last_haircut asc,
owner.name;
Through some playing around I have managed to get the correct result with:
haircut_annotation = Subquery(
Haircut.objects
.filter(dog__owner=OuterRef("pk"), at__range=(start, end))
.order_by("-at")
.values("at")[:1]
)
owners_by_shaggiest_dog_2 = (
Owner.objects
.annotate(last_haircut=haircut_annotation)
.order_by("-last_haircut", "name")
)
However, the resulting SQL seems inefficient as a new query is performed for every row:
select
owner.id,
owner.name,
(
select
from haircut
inner join dog on haircut.dog_id = dog.id
where haircut.at
between current_date - interval '2' year
and current_date - interval '2' month
and dog.owner_id = (owner.id)
order by
haircut.at asc
limit 1
) last_haircut
from
owner
order by
last_haircut asc,
owner.name;
P.S. I don't actually run a dog salon so I can't give you a voucher. Sorry!
Upvotes: 2
Views: 525
Reputation: 476534
Given I understood it correctly, you can make a query like:
from django.db.models import Max
Owners.objects.filter(
dogs__haircuts__at__range=(start, end)
).annotate(
last_haircut=Max('dogs__haircuts__at')
).order_by('last_haircut', 'name')
The last haircut should be the Max
imum here, since as time passes by, the timestamp is larger.
Note however that your query and this query, does not exclude owners of dogs that have been washed more recently. We simply do not take that into account when we calculate the last_haircut
.
If you want to exclude such owners, you should build a query like:
from django.db.models import Max
Owners.objects.exclude(
dogs__haircuts__at__gt=end
).filter(
dogs__haircuts__at__range=(start, end)
).annotate(
last_haircut=Max('dogs__haircuts__at')
).order_by('last_haircut', 'name')
Upvotes: 1