boruchsiper
boruchsiper

Reputation: 2028

order by string field with numbers

I'm selecting customer orders and i order them by customer_name.

The customer names are interesting, they go something like this

when I write the following query

select customer_name from orders order by customer_name

This is what I get

Notice that - Some customer #2 goes after - Some customer #11

I found this post which says to do the following:

ORDER BY 
CASE WHEN customer_name not like '%[0-9]%' THEN customer_name ELSE
   STUFF(customer_name, PATINDEX('%[0-9]%',customer_name), 0, replicate('0', 
   PATINDEX('%[0-9]%',customer_name) - len(customer_name) + PATINDEX('%[0-9]%',reverse(customer_name)) + 6))
END

Is there a better way to order by the numbers at the end?

Upvotes: 0

Views: 170

Answers (4)

Gordon Linoff
Gordon Linoff

Reputation: 1269503

If all the numbers start with '#', then you can do:

order by (case when customer_name not like '%#%' then customer_name
               else left(customer_name, charindex('#', customer_name))
          end),
         len(customer_name),
         customer_name

You could fiddle with trying to extract the number at the end of the string, if it exists, but given the formats in your question, this should work.

Upvotes: 2

Solomon Rutzky
Solomon Rutzky

Reputation: 48776

This is a two part answer :)

First: Getting the Sorting Right

The first thing to do is find a way to normalize the company_name values that have a "#". Since this is a string field it is going to sort numbers based on their position, not based on their value. So rather than split this value into multiple fields to put into an ORDER BY, let's make the string representation of the customer number something that is sortable. Meaning, while "2" naturally comes after "10" in terms of them being strings, "02" comes before "10", as expected. Take a look-see at the following:

SET NOCOUNT ON;
SET ANSI_NULLS ON;

-- DROP TABLE #Orders
CREATE TABLE #Orders
(
  OrderID INT IDENTITY(1, 1) NOT NULL,
  CustomerName NVARCHAR(50) NOT NULL
);

INSERT INTO #Orders (CustomerName) VALUES ('A Customer');
INSERT INTO #Orders (CustomerName) VALUES ('A Customer #1');
INSERT INTO #Orders (CustomerName) VALUES ('A Customer #2');
INSERT INTO #Orders (CustomerName) VALUES ('Some customer #1');
INSERT INTO #Orders (CustomerName) VALUES ('Some customer #2');
INSERT INTO #Orders (CustomerName) VALUES ('Some customer #10');
INSERT INTO #Orders (CustomerName) VALUES ('Some customer #11');


SELECT *,
       CASE WHEN CustomerName LIKE N'%#%' THEN
           STUFF(
                 [CustomerName],
                 (CHARINDEX(N'#', [CustomerName]) + 1), -- start just after the "#"
                 0, -- don't overwrite anything
                 REPLICATE(N'0',
                        (6 - (LEN([CustomerName]) - CHARINDEX(N'#', [CustomerName])))
                          ) -- ensure 6 digits
                )
            ELSE [CustomerName]
       END AS [SortName-test]
FROM #Orders
ORDER BY [SortName-test];

The results are as you desire.

Second: How / Where to Implement This

Sure, you could always just put this expression in the ORDER BY as follows:

SELECT *
FROM #Orders
ORDER BY
       CASE WHEN CustomerName LIKE N'%#%' THEN
           STUFF(
                 [CustomerName],
                 (CHARINDEX(N'#', [CustomerName]) + 1), -- start just after the "#"
                 0, -- don't overwrite anything
                 REPLICATE(N'0',
                        (6 - (LEN([CustomerName]) - CHARINDEX(N'#', [CustomerName])))
                          ) -- ensure 6 digits
                )
            ELSE [CustomerName]
       END;

But, depending on the size of the table and how often this field will be used in an ORDER BY (or even a GROUP BY or any other operation requiring a sort), doing an on-the-fly manipulation can be quite costly, and it can't be indexed.

In that case, it can be added as a PERSISTED COMPUTED COLUMN. Since it is "computed" it will change appropriately with any changes to the value in the base [company_name] field. And since it is "persisted", a) the value will be there and can even be used by the Query Optimizer, and b) it can be indexed for even more betterness.

Just run:

ALTER TABLE #Orders ADD [SortName] AS (CASE WHEN CustomerName LIKE N'%#%' THEN
           STUFF(
                 [CustomerName],
                 (CHARINDEX(N'#', [CustomerName]) + 1), -- start just after the "#"
                 0, -- don't overwrite anything
                 REPLICATE(N'0',
                        (6 - (LEN([CustomerName]) - CHARINDEX(N'#', [CustomerName]))))
                          ) -- ensure 6 digits
            ELSE [CustomerName]
       END) PERSISTED;

And that leaves you with the uber-complicated query of:

SELECT *
FROM #Orders
ORDER BY [SortName];

Upvotes: 2

Pரதீப்
Pரதீப்

Reputation: 93694

Try this. Simple way would be

ORDER  BY CASE
            WHEN customer_name NOT LIKE '%#%' THEN '0'
            WHEN customer_name LIKE '%#%' THEN LEFT(customer_name, Len(customer_name) - Charindex('#', customer_name))
          END,
          RIGHT(customer_name, Len(customer_name) - Charindex('#', customer_name)) 

Upvotes: 0

Giorgos Betsos
Giorgos Betsos

Reputation: 72165

You can break customer name in two pieces: Preceding string, trailing number and order by string, number.

See this:

SELECT *, LEFT(customer_name, LEN(customer_name) - PATINDEX('%[0-9][^0-9]%',  REVERSE(customer_name) )),
          RIGHT(customer_name, PATINDEX('%[0-9][^0-9]%',  REVERSE(customer_name) ))
FROM @Orders
ORDER BY LEFT(customer_name, LEN(customer_name) - PATINDEX('%[0-9][^0-9]%',  REVERSE(customer_name) )),
         CAST(RIGHT(customer_name, PATINDEX('%[0-9][^0-9]%',  REVERSE(customer_name) )) AS INT)

The above will work even if there is no '#' inside customer_name and even if there is another number inside customer_name, e.g. with Customer 12, Jonn 2 Smith #12 etc.

Upvotes: 0

Related Questions