Reputation: 48864
I'm trying to sort a list of titles, but currently there's a giant block of titles which start with 'The '. I'd like the 'The ' to be ignored, and the sort to work off the second word. Is that possible in SQL, or do I have to do custom work on the front end?
For example, current sorting:
Would be better sorted:
Almost as if the records were stored as 'Fountain, The', and the like. But I don't want to store them that way if I can, which is of course the crux of the problem.
Upvotes: 17
Views: 9128
Reputation: 11
Ways that will only remove the first The
:
=SUBSTITUTE(A1,"The ","",1) OR more reliably:
=IF(IF(LEFT(A1,4)="The ",TRUE)=TRUE,RIGHT(A1,(LEN(A1)-4)),A1)
Second one is basically saying if the first left digit equals The
, then check how many digits are in the cell, and show only the the right hand digits excluding The
.
Upvotes: 1
Reputation: 1747
For SQLite
ORDER BY CASE WHEN LOWER(SUBSTR(title,1,4)) = 'the ' THEN SUBSTR(title,5) ELSE title END ASC
Upvotes: 1
Reputation: 11739
Something like:
ORDER BY IF(LEFT(title,2) = "A ",
SUBSTRING(title FROM 3),
IF(LEFT(title,3) = "An ",
SUBSTRING(title FROM 4),
IF(LEFT(title,4) = "The ",
SUBSTRING(title FROM 5),
title)))
But given the overhead of doing this more than a few times, you're really better off storing the title sort value in another column...
Upvotes: 9
Reputation: 24966
You can certainly arrange dynamically strip off 'The', though you'll soon find that you have to deal with 'A' and 'An' (except for the special case of titles like "A is for Alibi"). When "foreign" films enter the mix, you'll need to cope with "El" and "La" (except for that pesky edge case, "LA Story"). Then mix in some German films, and you'll need to cope with 'Der' and 'Die' (except for that pesky set of 'Die Hard' edge cases). See the pattern? You're headed down a path that keeps getting longer and more pitted with special cases.
The way forward on this that avoids an ever-growing set of special cases is to store the title as you want it display and store the title as you want it sorted.
Upvotes: 3
Reputation: 15849
Best is to have a computed column to do this, so that you can index the computed column and order by that. Otherwise, the sort will be a lot of work.
So then you can have your computed column as:
CASE WHEN title LIKE 'The %' THEN stuff(title,1,4,'') + ', The' ELSE title END
Edit: If STUFF isn't available in MySQL, then use RIGHT or SUBSTRING to remove the leading 4 characters. But still try to use a computed column if possible, so that indexing can be better. The same logic should be applicable to rip out "A " and "An ".
Rob
Upvotes: 12
Reputation: 332661
The best way to handle this would be to have a column that contains the value you want to use specifically for ordering output. Then you'd just have to use:
SELECT t.title
FROM MOVIES t
ORDER BY t.order_title
There are going to be various rules about what should and should not be used to order titles.
Based on your example, an alternative would be to use something like:
SELECT t.title
FROM MOVIES t
ORDER BY SUBSTR(t.title, INSTR(t.title, 'The '))
You could use a CASE statement to contain the various rules.
Upvotes: 3
Reputation: 75336
I think you could do something like
ORDER BY REPLACE(TITLE, 'The ', '')
although this would replace any occurrence of 'The ' with '', not just the first 'The ', although I don't think this would affect very much.
Upvotes: 5