Chris Monteiro
Chris Monteiro

Reputation: 3

Oracle SQL: Further sort PARTITION BY groups based on first row in each partition

I currently have a very large query, and am trying to further sort the data in a specific way. The query returns data in the following way, with items sorted by TimeToComplete descending:

|   id     |    ItemKey     |TimeToComplete|
| -------- | -------------- |--------------|
| ABC      | KEY-211-01     |    580       |
| DEF      | KEY-311-01     |    456       |
| GHI      | KEY-111-01     |    150       |
| JKL      | KEY-411-01     |     87       |
| XYZ      | KEY-311-01     |     23       |
| KNE      | KEY-211-01     |     17       |

What I am trying to do is sort this data so like ItemKeys are grouped together, but otherwise still retains the TimeToComplete sorting based on the first item in the group. Example like so:

|   id     |    ItemKey     |TimeToComplete|
| -------- | -------------- |--------------|
| ABC      | KEY-211-01     |    580       |
| KNE      | KEY-211-01     |     17       |
| DEF      | KEY-311-01     |    456       |
| XYZ      | KEY-311-01     |     23       |
| GHI      | KEY-111-01     |    150       |
| JKL      | KEY-411-01     |     87       |

I do have a partially working example, but it has some issues:

WITH GroupedRecords AS (
  SELECT 
          OriginalQuery.*,
          ROW_NUMBER() OVER (partition by ItemKey order by TimeToComplete DESC) as RN
  FROM (originally giant query here) OriginalQuery  
),
Sequence AS (
  SELECT 
         ItemKey,
         TimeToComplete,
         ROW_NUMBER() OVER (order by TimeToComplete DESC) as SequenceOrder
  FROM GroupedRecords
  WHERE RN = 1
)
select T.*, s.SequenceOrder
from GroupedRecords T
INNER JOIN Sequence S ON T.ItemKey = S.ItemKey
ORDER BY S.SequenceOrder ASC, T.TimeToComplete DESC

The problems with this are:

  1. The inner join between GroupedRecords and the Sequence changes a bunch of my column names (but not all of them) to Oracle's randomly generated names (QCSJ_0000006)
  2. The join is also makes the query way too slow (OriginalQuery is already not very optimized, but this is doubling its execution time)

Question: Is there a more efficient way I can achieve this sorting without using the sequence/join parts?

Upvotes: 0

Views: 3516

Answers (2)

Jon Armstrong
Jon Armstrong

Reputation: 4694

Just add the following window function to your select list and then order the results as follows:

SELECT id, ItemKey, TimeToComplete
     , MAX(TimeToComplete) OVER (PARTITION BY ItemKey) AS max_time
  FROM data
 ORDER BY max_time DESC, ItemKey, TimeToComplete DESC
;

Result:

+------+------------+----------------+----------+
| id   | ItemKey    | TimeToComplete | max_time |
+------+------------+----------------+----------+
| ABC  | KEY-211-01 |            580 |      580 |
| KNE  | KEY-211-01 |             17 |      580 |
| DEF  | KEY-311-01 |            456 |      456 |
| XYZ  | KEY-311-01 |             23 |      456 |
| GHI  | KEY-111-01 |            150 |      150 |
| JKL  | KEY-411-01 |             87 |       87 |
+------+------------+----------------+----------+

Result, with data to test when two ItemKeys have the same max_time:

+------+------------+----------------+----------+
| id   | ItemKey    | TimeToComplete | max_time |
+------+------------+----------------+----------+
| ABC  | KEY-211-01 |            580 |      580 |
| KNE  | KEY-211-01 |             17 |      580 |
| ABD  | KEY-211-02 |            580 |      580 |
| ABE  | KEY-211-02 |            200 |      580 |
| DEF  | KEY-311-01 |            456 |      456 |
| XYZ  | KEY-311-01 |             23 |      456 |
| GHI  | KEY-111-01 |            150 |      150 |
| JKL  | KEY-411-01 |             87 |       87 |
+------+------------+----------------+----------+

You can generate the same result without the extra column by using a CTE term to add the new column, order by that in the outer query expression, but don't select that column in the outer query expression select list.

As follows:

WITH cte1 AS (
        SELECT id, ItemKey, TimeToComplete
             , MAX(TimeToComplete) OVER (PARTITION BY ItemKey) AS max_time
          FROM data
     )
SELECT id, ItemKey, TimeToComplete
  FROM cte1
 ORDER BY max_time DESC, ItemKey, TimeToComplete DESC
;

Result (updated with extra data):

+------+------------+----------------+
| id   | ItemKey    | TimeToComplete |
+------+------------+----------------+
| ABC  | KEY-211-01 |            580 |
| KNE  | KEY-211-01 |             17 |
| ABD  | KEY-211-02 |            580 |
| ABE  | KEY-211-02 |            200 |
| DEF  | KEY-311-01 |            456 |
| XYZ  | KEY-311-01 |             23 |
| GHI  | KEY-111-01 |            150 |
| JKL  | KEY-411-01 |             87 |
+------+------------+----------------+

Working test case - Updated to handle case raised by @mathguy

Upvotes: 0

user5683823
user5683823

Reputation:

Luckily, you only need to add an analytic max() to the order by clause. You don't need to do anything else.

Suppose "current query" is your existing query, not ordered yet in any way (no order by clause). Add the following at the very end:

... existing query ...
order  by max(timetocomplete) over (partition by itemkey) desc,
          itemkey,
          timetocomplete desc
;

Note that you do not need to add the analytic function to the select clause. The SQL standard says you do; Oracle syntax says you don't. Oracle is taking care of the small additional steps for us, behind the scenes.

This computes the max time to complete for each key. It orders by that max first. In the case of ties (two or more different keys with the same max time to complete), it further orders by key first, and then within each key, by time to complete (descending).

Upvotes: 1

Related Questions