Florian
Florian

Reputation: 69

Django - Multi filtering queryset return empty queryset

I have a problem with queryset in Django 2.0, after some research, I don't find any problem looks like mine.

I think it's because of my very old legacy database create by someone I didn't know.

So, I have a sqlite database who looks like this:

Has you can see, the Table Properties don't have primary_key, so i made a models with django inspectdb command who looks like this:

from django.db import models

class Record(models.Model):
    id = models.IntegerField(db_column='ID', primary_key=True)

    class Meta:
        db_table = 'Records'

    def __str__(self):
        return "%s" % self.id


class Propertie(models.Model):
    id = models.ForeignKey(Record, models.DO_NOTHING, db_column='ID', primary_key=True)
    item = models.CharField(db_column='Item', max_length=500)
    value = models.CharField(db_column='Value', max_length=500)

    class Meta:
        db_table = 'Properties'

    def __str__(self):
        return '[%s]- %s -> %s' % (self.item, self.value, self.id)

I set Properties.id as primary_key but it's a ForeignKey and Django say to set this field as OneToOneField and it's normal and logic, but 1 Record is linked to 9 Properties so Porpertie.id can't be unique this is my first problem because I can't alter the database.

My second and real problem is when I run this query:

def my_view(request):

   epoch = datetime.date(1970, 1, 1)
   period_from = stat_form.cleaned_data.get("period_from")
   period_to = stat_form.cleaned_data.get("period_to")
   product = stat_form.cleaned_data.get("kit")

   timestamp_from = period_from - epoch
   timestamp_to = period_to - epoch

   records = Record.objects.using("statool").filter(
        propertie__item="product",
        propertie__value=product,
    ).filter(
        propertie__item="stamp",
        propertie__value__gt=str(int(timestamp_from.total_seconds())),
        propertie__value__lt=str(int(timestamp_to.total_seconds())),
    ).count()

this QuerySet is empty but it should return approximately 16XXX Record I don't know what happens?

Because if I do this query:

  records = Record.objects.using("statool").filter(
        propertie__item="product",
        propertie__value=product,
  )

It returns a result but the second filter doesn't work ...

The goal of those request is to get the Record out with the specific date and product name.

the 9 possibilities of item field in Properties can be:

A future query with the same logic will be applied just after to get version by product and by site.

Thank you for your help! And sorry for my bad English :)

Upvotes: 0

Views: 1364

Answers (1)

Florian
Florian

Reputation: 69

To answer my problem,

first i have stoped to try user multi .filter because when i run:

records = Record.objects.using("statool").filter(
    propertie__item="product",
    propertie__value=product,
).filter(
    propertie__item="stamp",
    propertie__value__gt=str(int(timestamp_from.total_seconds())),
    propertie__value__lt=str(int(timestamp_to.total_seconds())),
).count()

After the first .filterRecord objects lost reference to propertie_set so i can't filter by propertie.

As say @ukemi and @Ralf, using:

.filter(
    propertie__item="stamp",
    propertie__value__gt=str(int(timestamp_from.total_seconds())),
    propertie__value__lt=str(int(timestamp_to.total_seconds())),
)

is a really bad idea to have exact query.

So this is my solution:

def select_stats(request):
    epoch = datetime.date(1970, 1, 1)
    period_from = stat_form.cleaned_data.get("period_from")
    period_to = stat_form.cleaned_data.get("period_to")
    product = stat_form.cleaned_data.get("kit")

    timestamp_from = period_from - epoch
    timestamp_to = period_to - epoch
    timestamp_from = int(timestamp_from.total_seconds())
    timestamp_to = int(timestamp_to.total_seconds())

    all_product = Propertie.objects.using("statool").filter(
        item="product",
        value=product
    ).values_list("id", flat=True)

    all_stamp = Propertie.objects.using("statool").annotate(
        date=Cast("value", IntegerField())
    ).filter(
        date__gte=timestamp_from,
        date__lt=timestamp_to
    ).values_list("id", flat=True)

    all_records = Record.objects.using("statool").filter(
        id__in=all_product.intersection(all_stamp)
    )

    all_recorded_propertie = Propertie.objects.using("statool").filter(id__in=all_records)

    all_version = all_recorded_propertie.filter(
        id__in=all_records,
        item="version"
    ).values_list("value", flat=True).distinct()

    all_site = all_recorded_propertie.filter(
        id__in=all_records,
        item="site"
    ).values_list("value", flat=True).distinct()

    stats_site = {}
    for version in all_version:
        stats_site[version] = {}
        id_version = all_recorded_propertie.filter(
            item="version",
            value=version
        ).values_list("id", flat=True)
        for site in all_site:
            id_site = all_recorded_propertie.filter(
                item="site", 
                value=site
            ).values_list("id", flat=True)
            stats_site[version][site] = id_version.intersection(id_site).count()

To solve timestamp problem by this way:

all_stamp = Propertie.objects.using("statool").annotate(
    date=Cast("value", IntegerField())
).filter(
    date__gte=timestamp_from,
    date__lt=timestamp_to
).values_list("id", flat=True)

Thank's to @erikreed from this thread: Django QuerySet Cast

By the way, this is the most efficient way i've found to do my job.
But if we run this view we have this runtime: view query runtime

As you can see, every QuerySet are very fast, but intersections between version.id and site.id are very long (more than 2 minutes).

If someone know a better way to do those query, just let us know :)
Hope I help someone.

Upvotes: 1

Related Questions