John Smith
John Smith

Reputation: 115

Django: Fastest way to random query one record using filter

What is the fastest way to query one record from database that is satisfy my filter query.

mydb.objects.filter(start__gte='2017-1-1', status='yes').order_by('?')[:1]

This statement will first query thousands of records and then select one, and it is very slow, but I only need one, a random one. what is the fastest one to get?

Upvotes: 1

Views: 442

Answers (2)

badiya
badiya

Reputation: 2257

Using order_by('?') will cause you a great performance issue. A better way is to use something like this: Getting a random row from a relational database.

count = mydb.objects.filter(start__gte='2017-1-1', status='yes').aggregate(count=Count('id'))['count']
random_index = randint(0, count - 1)
result= mydb.objects.filter(start__gte='2017-1-1', status='yes')[random_index]

Upvotes: 3

cosinepenguin
cosinepenguin

Reputation: 1575

Well, I'm not sure you will be able to do exactly what you want. I was running into a similar issue a few months ago and I ended up redesigning my implementation of my backend to make it work.

Essentially, you want to make the query time shorter by having it choose a random record that fulfills both requirements (start__gte='2017-1-1', status='yes'), but like you say in order for the query to do so, it needs to filter your entire database. This means that you can cannot get a "true" random record from the database that also fulfills the filter requirements, because filtering inherently needs to look through all of your records (otherwise it wouldn't be really random, it would just be the first one it finds that fulfills your requirements).

Instead, consider putting all records that have a status='yes' in a separate relation, so that you can pull a random record from there and join with the larger relation. That would make the query time considerably faster (and it's the type of solution I implemented to get my code to work).

If you really want a random record with the correct filter information, you might need to employ some convoluted means.

You could use a custom manager in Django to have it find only one random record, something like this:

class UsersManager(models.Manager):
    def random(self):
        count = self.aggregate(count=Count('id'))['count']
        random_index = randint(0, count - 1)
        return self.all()[random_index]

class User(models.Model):
    objects = UsersManager()
    #Your fields here (whatever they are, it seems start__gte and status are some)!
    objects = UserManager()

Which you can invoke then just by using:

User.objects.random()

This could be repeated with a check in your code until it returns a random record that fulfills your requirements. I don't think this is necessarily the cleanest or programmatically correct way of implementing this, but I don't think a faster solution exists for your specific issue.

I used this site as a source for this answer, and it has a lot more solid information about using this custom random method! You'll likely have to change the custom manager to serve your own needs, but if you add the random() method to your existing custom manager it should be able to do what you need of it!

Hope it helps!

Upvotes: 3

Related Questions