Reputation: 11068
I'm in a situation where I must output a quite large list of objects by a CharField used to store street addresses.
My problem is, that obviously the data is ordered by ASCII codes since it's a Charfield, with the predictable results .. it sort the numbers like this;
1, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2, 20, 21....
Now the obvious step would be to change the Charfield the proper field type (IntegerField let's say), however it cannot work since some address might have apartments .. like "128A".
I really don't know how I can order this properly ..
Upvotes: 33
Views: 29233
Reputation: 1
The post before last helped me a lot in Netbox. I just can't reply because my account is still too young :D https://stackoverflow.com/a/70153906/27018495 THX @Dr Manhattan
I have changed it a bit. Because I want to count up the inventory numbers that start with a sequence of letters. The highest number is that of the predecessor.
Example alphabetical sorting:
Next Result: SP40003
###################################
# DE: Ist kein Asset-Tag angegeben, wird das höchste Asset-Tag beginnend mit SP gesucht und um 1 erhöht. Gibt es noch kein Asset Tag so wird mit 1 gestartet.
# EN: If no asset tag is specified, the highest asset tag starting with SP is searched for and incremented by 1. If there is no asset tag yet, it is started with 1.
# THX: https://github.com/netbox-community/netbox/discussions/9226
# https://stackoverflow.com/a/70153906/27018495
class CreateAssetTag(CustomValidator):
def validate(self, obj):
from django.db.models.expressions import RawSQL
from operator import attrgetter
prefix = "SP" # Text that should appear in the asset name on the left
max_digits = 0 # Number of leading zeros in the count
if not obj.asset_tag:
compareObj = obj.__class__.objects.filter(asset_tag__iregex=r'^' + prefix + r'(\d+)$')\
.all().annotate(theLastAssetId=RawSQL("CAST(regexp_replace(asset_tag, '[^0-9]+', '', 'g') AS INTEGER)", '')) #.order_by('-theLastAssetId').first() # *Alternative to the max function
maxObj = max(compareObj, key=attrgetter('theLastAssetId')) # Entry with the highest asset value
if not compareObj:
# Returns 1 if no comparison object has an asset tag yet.
obj.asset_tag = prefix + "1".zfill(max_digits)
return
try:
next_id = maxObj.theLastAssetId + 1
#next_id = compareObj.theLastAssetId + 1 # *Alternative to the max function
obj.asset_tag = prefix + str(next_id).zfill(max_digits)
except ValueError:
# If the ID cannot be parsed, nothing happens (appropriate error handling can be added)
return
Upvotes: 0
Reputation: 3387
I have a similar situation. An old database has a table with mostly numeric data in a column (named page
) with type varchar
. I need to sort the results queried from that table in a natural numeric order, but I don't need the data type changed in the results. I ended up converting the column value to a number for the sorting only:
(ItemPage.objects.filter(volume__id=volumeId)
.order_by('item__topic__name', F('page') * 1, 'item__name'))
I see some other answers here indicate that MySQL's CAST()
may not work with strings containing a mixture of digitis and alphabetics, but from experiments, I don't think that's true. Maybe it's a problem for older versions of MySQL.
Whether that's a problem or not, I decided to multiply the value by one instead. It works well for me. At some point, I'll experiment with CAST()
and see whether it always works and does it execute faster than multiplying by one.
Upvotes: 0
Reputation: 14037
All the answeres in this thread did not work for me because they are assuming numerical text. I found a solution that will work for a subset of cases. Consider this model
Class Block(models.Model):
title = models.CharField()
say I have fields that sometimes have leading characters and trailing numerical characters If i try and order normally
>>> Block.objects.all().order_by('title')
<QuerySet [<Block: 1>, <Block: 10>, <Block: 15>, <Block: 2>, <Block: N1>, <Block: N12>, <Block: N4>]>
As expected, it's correct alphabetically, but makes no sense for us humans. The trick that I did for this particular use case is to replace any text i find with the number 9999 and then cast the value to an integer and order by it.
for most cases that have leading characters this will get the desired result. see below
from django.db.models.expressions import RawSQL
>>> Block.objects.all()\
.annotate(my_faux_integer=RawSQL("CAST(regexp_replace(title, '[A-Z]+', '9999', 'g') AS INTEGER)", ''))\
.order_by('my_faux_integer', 'title')
<QuerySet [<Block: 1>, <Block: 2>, <Block: 10>, <Block: 15>, <Block: N1>, <Block: N4>, <Block: N12>]>
Upvotes: 0
Reputation: 375
I know that I’m late on this, but since it’s strongly related to the question, and that I had a hard time finding this:
You have to know that you can directly put the Cast
in the ordering
option of your model.
from django.db import models
from django.db.models.functions import Cast
class Address(models.Model):
street_number = models.CharField()
class Meta:
ordering = [
Cast("street_number", output_field=models.IntegerField()),
]
From the doc about ordering:
You can also use query expressions.
And from the doc about database functions:
Functions are also expressions, so they can be used and combined with other expressions like aggregate functions.
Upvotes: 5
Reputation: 2400
Django is trying to deprecate the extra()
method, but has introduced Cast()
in v1.10. In sqlite (at least), CAST
can take a value such as 10a
and will cast it to the integer 10
, so you can do:
from django.db.models import IntegerField
from django.db.models.functions import Cast
MyModel.objects.annotate(
my_integer_field=Cast('my_char_field', IntegerField())
).order_by('my_integer_field', 'my_char_field')
which will return objects sorted by the street number first numerically, then alphabetically, e.g. ...14, 15a, 15b, 16, 16a, 17...
Upvotes: 23
Reputation: 714
In my case i have a CharField with a name field, which has mixed (int+string) values, for example. "a1", "f65", "P", "55" e.t.c ..
Solved the issue by using the sql cast (tested with postgres & mysql), first, I try to sort by the casted integer value, and then by the original value of the name field.
parking_slots = ParkingSlot.objects.all().extra(
select={'num_from_name': 'CAST(name AS INTEGER)'}
).order_by('num_from_name', 'name')
This way, in any case, the correct sorting works for me.
Upvotes: 2
Reputation: 3734
I was looking for a way to sort the numeric chars in a CharField
and my search led me here. The name
fields in my objects are CC Licenses, e.g., 'CC BY-NC 4.0'.
Since extra()
is going to be deprecated, I was able to do it this way:
MyObject.objects.all()
.annotate(sorting_int=Cast(Func(F('name'), Value('\D'), Value(''), Value('g'), function='regexp_replace'), IntegerField()))
.order_by('-sorting_int')
Thus, MyObject
with name='CC BY-NC 4.0'
now has sorting_int=40
.
Upvotes: 1
Reputation: 31555
In case you need to sort version numbers consisting of multiple numbers separated by a dot (e.g. 1.9.0, 1.10.0
), here is a postgres-only solution:
class VersionRecordManager(models.Manager):
def get_queryset(self):
return super().get_queryset().extra(
select={
'natural_version': "string_to_array(version, '.')::int[]",
},
)
def available_versions(self):
return self.filter(available=True).order_by('-natural_version')
def last_stable(self):
return self.available_versions().filter(stable=True).first()
class VersionRecord(models.Model):
objects = VersionRecordManager()
version = models.CharField(max_length=64, db_index=True)
available = models.BooleanField(default=False, db_index=True)
stable = models.BooleanField(default=False, db_index=True)
In case you want to allow non-numeric characters (e.g. 0.9.0 beta
, 2.0.0 stable
):
def get_queryset(self):
return super().get_queryset().extra(
select={
'natural_version':
"string_to_array( "
" regexp_replace( " # Remove everything except digits
" version, '[^\d\.]+', '', 'g' " # and dots, then split string into
" ), '.' " # an array of integers.
")::int[] "
}
)
Upvotes: 2
Reputation: 2329
If you're using PostgreSQL (not sure about MySQL) you can safely use following code on char/text fields and avoid cast errors:
MyModel.objects.extra(
select={'myinteger': "CAST(substring(charfield FROM '^[0-9]+') AS INTEGER)"}
).order_by('myinteger')
Upvotes: 19
Reputation: 75
Great tip! It works for me! :) That's my code:
revisioned_objects = revisioned_objects.extra(select={'casted_object_id': 'CAST(object_id AS INTEGER)'}).extra(order_by = ['casted_object_id'])
Upvotes: 3
Reputation: 599590
If you're sure there are only integers in the field, you could get the database to cast it as an integer via the extra
method, and order by that:
MyModel.objects.extra(
select={'myinteger': 'CAST(mycharfield AS INTEGER)'}
).order_by('myinteger')
Upvotes: 29
Reputation: 24966
The problem you're up against is quite similar to how filenames get ordered when sorting by filename. There, you want "2 Foo.mp3" to appear before "12 Foo.mp3".
A common approach is to "normalize" numbers to expanding to a fixed number of digits, and then sorting based on the normalized form. That is, for purposes of sorting, "2 Foo.mp3" might expand to "0000000002 Foo.mp3".
Django won't help you here directly. You can either add a field to store the "normalized" address, and have the database order_by
that, or you can do a custom sort in your view (or in a helper that your view uses) on address records before handing the list of records to a template.
Upvotes: 2