Gerard
Gerard

Reputation: 9408

Django - Filter queryset by CharField value length

Given that I have a legacy model with a CharField or CharField-based model field like:

class MyModel(models.Model):
    name = models.CharField(max_length=1024, ...)
    ...

I need to make migrations to make it have a max_length of max. 255. First I'm writing a datamigration to make any values with longer than 255 characters adapt the upcoming schemamigration to fix the maximum length of the column, which I will do right after this works.

The issue is that I have a very very large set of data and I know not all the rows contain a value longer than 255 chars for MyModel.name and I would like to consider for my migration only those who do.

Is there any way (with) the django ORM to filter only the objects that meet this condition? Something like:

MyModel.objects.filter(name__len__gte=255)

would be great, but I believe this is not possible, or at least it's not this straightforward.

Does somebody know any way to accomplish this query?

Thanks!

Upvotes: 16

Views: 10354

Answers (2)

BBT
BBT

Reputation: 461

'Recent' Django versions have the built in django.db.models.functions.Length lookup/transform, so you can do:

MyModel.objects.annotate(length=Length('text')).filter(length__gt=254)

See https://docs.djangoproject.com/en/1.11/ref/models/database-functions/#length

Old answer:

I think you have to options:

Using 'extra' on the queryset:

MyModel.objects.extra(where=["CHAR_LENGTH(text) > 254"])

Or abusing Regex lookups, I'm assuming this will be slower:

MyModel.objects.filter(text__regex = r'^.{254}.*')

Upvotes: 29

Lord Mosuma
Lord Mosuma

Reputation: 368

If you find that you are using a lot of extra and regex, following @BBT's suggestions, I went ahead and implemented the transform as follows:

# utils.db

from django.db.models import Transform
from django.db.models import CharField

class CharacterLength(Transform):
    lookup_name = 'len'
    def as_sql(self, compiler, connection):
        lhs, params = compiler.compile(self.lhs)
        return "LENGTH(%s)" % lhs, params

CharField.register_lookup(CharacterLength)

Thereafter, I could do a cascaded lookup on "mycolname" as follows:

from utils.db import *

queryset.filter(mycolname__len__gte=10)

Upvotes: 7

Related Questions