Reputation: 1491

Sorting twice on same column

I'm having a bit of a weird question, given to me by a client.

He has a list of data, with a date between parentheses like so:

Foo (14/08/2012)
Bar (15/08/2012)
Bar (16/09/2012)
Xyz (20/10/2012)

However, he wants the list to be displayed as follows:

Foo (14/08/2012)
Bar (16/09/2012)
Bar (15/08/2012)
Foot (20/10/2012)

(notice that the second Bar has moved up one position)

So, the logic behind it is, that the list has to be sorted by date ascending, EXCEPT when two rows have the same name ('Bar'). If they have the same name, it must be sorted with the LATEST date at the top, while staying in the other sorting order.

Is this even remotely possible? I've experimented with a lot of ORDER BY clauses, but couldn't find the right one. Does anyone have an idea?

I should have specified that this data comes from a table in a sql server database (the Name and the date are in two different columns). So I'm looking for a SQL-query that can do the sorting I want.

(I've dumbed this example down quite a bit, so if you need more context, don't hesitate to ask)

Upvotes: 4

Answers (5)

FabianoLothor

Reputation: 2985

Try something like...

ORDER BY CASE date
    WHEN '14/08/2012' THEN 1
    WHEN '16/09/2012' THEN 2
    WHEN '15/08/2012' THEN 3
    WHEN '20/10/2012' THEN 4
END

In MySQL, you can do:

ORDER BY FIELD(date, '14/08/2012', '16/09/2012', '15/08/2012', '20/10/2012')

In Postgres, you can create a function FIELD and do:

CREATE OR REPLACE FUNCTION field(anyelement, anyarray) RETURNS numeric AS $$
  SELECT
    COALESCE((SELECT i
              FROM generate_series(1, array_upper($2, 1)) gs(i)
              WHERE $2[i] = $1),
             0);
$$ LANGUAGE SQL STABLE

If you do not want to use the CASE, you can try to find an implementation of the FIELD function to SQL Server.

Upvotes: -2

Peter Lang

Reputation: 55624

This one uses analytic functions to perform the sort, it only requires one SELECT from your table.

The inner query finds gaps, where the name changes. These gaps are used to identify groups in the next query, and the outer query does the final sorting by these groups.

I have tried it here (SQL Fiddle) with extended test-data.

SELECT name, dat
FROM (
  SELECT name, dat, SUM(gap) over(ORDER BY dat, name) AS grp
  FROM (
    SELECT name, dat,
          CASE WHEN LAG(name) OVER (ORDER BY dat, name) = name THEN 0 ELSE 1 END AS gap
    FROM t
  ) x
) y
ORDER BY grp, dat DESC

Extended test-data

('Bar','2012-08-12'),
('Bar','2012-08-11'),
('Foo','2012-08-14'),
('Bar','2012-08-15'),
('Bar','2012-08-16'),
('Bar','2012-09-17'),
('Xyz','2012-10-20')

Result

Bar     2012-08-12
Bar     2012-08-11
Foo     2012-08-14
Bar     2012-09-17
Bar     2012-08-16
Bar     2012-08-15
Xyz     2012-10-20

Upvotes: 1

Seph

Reputation: 8703

A way with better performance than any of the other posted answers is to just do it entirely with an ORDER BY and not a JOIN or using CTE:

DECLARE @t TABLE (myData varchar(50), myDate datetime)

INSERT INTO @t VALUES 
('Foo','2012-08-14'),
('Bar','2012-08-15'),
('Bar','2012-09-16'), 
('Xyz','2012-10-20')

SELECT *
FROM @t t1
ORDER BY (SELECT MIN(t2.myDate) FROM @t t2 WHERE t2.myData = t1.myData), T1.myDate DESC

This does exactly what you request and will work with any indexes and much better with larger amounts of data than any of the other answers.

Additionally it's much more clear what you're actually trying to do here, rather than masking the real logic with the complexity of a join and checking the count of joined items.

Upvotes: 2

Damien_The_Unbeliever

Reputation: 239824

I think that this works, including the case I asked about in the comments:

declare @t table (data varchar(50), [date] datetime)

insert @t 
values
('Foo','20120814'),
('Bar','20120815'),
('Bar','20120916'), 
('Xyz','20121020')

; With OuterSort as (
    select *,ROW_NUMBER() OVER (ORDER BY [date] asc) as rn from @t
)
--Now we need to find contiguous ranges of the same data value, and the min and max row number for such a range
, Islands as (
    select data,rn as rnMin,rn as rnMax from OuterSort os where not exists (select * from OuterSort os2 where os2.data = os.data and os2.rn = os.rn - 1)
    union all
    select i.data,rnMin,os.rn
    from
        Islands i
            inner join
        OuterSort os
            on
                i.data = os.data and
                i.rnMax = os.rn-1
), FullIslands as (
    select
        data,rnMin,MAX(rnMax) as rnMax
    from Islands
    group by data,rnMin
)
select
    *
from
    OuterSort os
        inner join
    FullIslands fi
        on
            os.rn between fi.rnMin and fi.rnMax
order by
    fi.rnMin asc,os.rn desc

It works by first computing the initial ordering in the OuterSort CTE. Then, using two CTEs (Islands and FullIslands), we compute the parts of that ordering in which the same data value appears in adjacent rows. Having done that, we can compute the final ordering by any value that all adjacent values will have (such as the lowest row number of the "island" that they belong to), and then within an "island", we use the reverse of the originally computed sort order.

Note that this may, though, not be too efficient for large data sets. On the sample data it shows up as requiring 4 table scans of the base table, as well as a spool.

Upvotes: 0

podiluska

Reputation: 51514

This works, I think

declare @t table (data varchar(50), date datetime)

insert @t 
values
('Foo','2012-08-14'),
('Bar','2012-08-15'),
('Bar','2012-09-16'), 
('Xyz','2012-10-20')

select t.*
from @t t
    inner join (select data, COUNT(*) cg, MAX(date) as mg from @t group by data) tc
        on t.data = tc.data
order by case when cg>1 then mg else date end, date desc

produces

data       date
---------- -----------------------
Foo        2012-08-14 00:00:00.000
Bar        2012-09-16 00:00:00.000
Bar        2012-08-15 00:00:00.000
Xyz        2012-10-20 00:00:00.000

Upvotes: 4

Sorting twice on same column

Answers (5)

Related Questions