Reputation: 3
I currently have a very large query, and am trying to further sort the data in a specific way. The query returns data in the following way, with items sorted by TimeToComplete descending:
| id | ItemKey |TimeToComplete|
| -------- | -------------- |--------------|
| ABC | KEY-211-01 | 580 |
| DEF | KEY-311-01 | 456 |
| GHI | KEY-111-01 | 150 |
| JKL | KEY-411-01 | 87 |
| XYZ | KEY-311-01 | 23 |
| KNE | KEY-211-01 | 17 |
What I am trying to do is sort this data so like ItemKeys are grouped together, but otherwise still retains the TimeToComplete sorting based on the first item in the group. Example like so:
| id | ItemKey |TimeToComplete|
| -------- | -------------- |--------------|
| ABC | KEY-211-01 | 580 |
| KNE | KEY-211-01 | 17 |
| DEF | KEY-311-01 | 456 |
| XYZ | KEY-311-01 | 23 |
| GHI | KEY-111-01 | 150 |
| JKL | KEY-411-01 | 87 |
I do have a partially working example, but it has some issues:
WITH GroupedRecords AS (
SELECT
OriginalQuery.*,
ROW_NUMBER() OVER (partition by ItemKey order by TimeToComplete DESC) as RN
FROM (originally giant query here) OriginalQuery
),
Sequence AS (
SELECT
ItemKey,
TimeToComplete,
ROW_NUMBER() OVER (order by TimeToComplete DESC) as SequenceOrder
FROM GroupedRecords
WHERE RN = 1
)
select T.*, s.SequenceOrder
from GroupedRecords T
INNER JOIN Sequence S ON T.ItemKey = S.ItemKey
ORDER BY S.SequenceOrder ASC, T.TimeToComplete DESC
The problems with this are:
Question: Is there a more efficient way I can achieve this sorting without using the sequence/join parts?
Upvotes: 0
Views: 3516
Reputation: 4694
Just add the following window function to your select list and then order the results as follows:
SELECT id, ItemKey, TimeToComplete
, MAX(TimeToComplete) OVER (PARTITION BY ItemKey) AS max_time
FROM data
ORDER BY max_time DESC, ItemKey, TimeToComplete DESC
;
Result:
+------+------------+----------------+----------+
| id | ItemKey | TimeToComplete | max_time |
+------+------------+----------------+----------+
| ABC | KEY-211-01 | 580 | 580 |
| KNE | KEY-211-01 | 17 | 580 |
| DEF | KEY-311-01 | 456 | 456 |
| XYZ | KEY-311-01 | 23 | 456 |
| GHI | KEY-111-01 | 150 | 150 |
| JKL | KEY-411-01 | 87 | 87 |
+------+------------+----------------+----------+
Result, with data to test when two ItemKey
s have the same max_time
:
+------+------------+----------------+----------+
| id | ItemKey | TimeToComplete | max_time |
+------+------------+----------------+----------+
| ABC | KEY-211-01 | 580 | 580 |
| KNE | KEY-211-01 | 17 | 580 |
| ABD | KEY-211-02 | 580 | 580 |
| ABE | KEY-211-02 | 200 | 580 |
| DEF | KEY-311-01 | 456 | 456 |
| XYZ | KEY-311-01 | 23 | 456 |
| GHI | KEY-111-01 | 150 | 150 |
| JKL | KEY-411-01 | 87 | 87 |
+------+------------+----------------+----------+
You can generate the same result without the extra column by using a CTE term to add the new column, order by that in the outer query expression, but don't select that column in the outer query expression select list.
As follows:
WITH cte1 AS (
SELECT id, ItemKey, TimeToComplete
, MAX(TimeToComplete) OVER (PARTITION BY ItemKey) AS max_time
FROM data
)
SELECT id, ItemKey, TimeToComplete
FROM cte1
ORDER BY max_time DESC, ItemKey, TimeToComplete DESC
;
Result (updated with extra data):
+------+------------+----------------+
| id | ItemKey | TimeToComplete |
+------+------------+----------------+
| ABC | KEY-211-01 | 580 |
| KNE | KEY-211-01 | 17 |
| ABD | KEY-211-02 | 580 |
| ABE | KEY-211-02 | 200 |
| DEF | KEY-311-01 | 456 |
| XYZ | KEY-311-01 | 23 |
| GHI | KEY-111-01 | 150 |
| JKL | KEY-411-01 | 87 |
+------+------------+----------------+
Working test case - Updated to handle case raised by @mathguy
Upvotes: 0
Reputation:
Luckily, you only need to add an analytic max()
to the order by
clause. You don't need to do anything else.
Suppose "current query" is your existing query, not ordered yet in any way (no order by
clause). Add the following at the very end:
... existing query ...
order by max(timetocomplete) over (partition by itemkey) desc,
itemkey,
timetocomplete desc
;
Note that you do not need to add the analytic function to the select
clause. The SQL standard says you do; Oracle syntax says you don't. Oracle is taking care of the small additional steps for us, behind the scenes.
This computes the max time to complete for each key. It orders by that max first. In the case of ties (two or more different keys with the same max time to complete), it further orders by key first, and then within each key, by time to complete (descending).
Upvotes: 1