artemis
artemis

Reputation: 7281

MAX() OVER PARTITION BY in Oracle SQL

I am trying to utilize the MAX() OVER PARTITION BY function to evaluate the most recent receipt for a specific part that my company has bought. Below is an example table of the information for a few parts from the last year:

| VEND_NUM | VEND_NAME    | RECEIPT_NUM | RECEIPT_ITEM | RECEIPT_DATE |
|----------|--------------|-------------|----------|--------------|
| 100      | SmallTech    | 2001        | 5844HAJ  | 11/22/2017   |
| 100      | SmallTech    | 3188        | 5521LRO  | 12/31/2017   |
| 200      | RealSolution | 5109        | 8715JUI  | 05/01/2017   |
| 100      | SmallTech    | 3232        | 8715JUI  | 11/01/2017   |
| 200      | RealSolution | 2101        | 4715TEN  | 01/01/2017   |

As you can see, the third and fourth row show two different vendors for the SAME part number.

Here is my current query:

WITH

-- various other subqueries above...

    AllData AS
    (
        SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE
        FROM tblVend
            INNER JOIN tblReceipt ON VEND_NUM = RECEIPT_VEND_NUM
        WHERE
            VEND_NUM = '100' OR VEND_NUM = '200' AND RECEIPT_DATE >= '01-Jan-2017'
    ),

    SELECT MAX(RECEIPT_DATE) OVER PARTITION BY(RECEIPT_ITEM) AS "Recent Date", RECEIPT_ITEM
    FROM AllData

My return set looks like:

| Recent Date | RECEIPT_ITEM |
|-------------|--------------|
| 11/22/2017  | 5844HAJ      |
| 12/31/2017  | 5521LRO      |
| 11/01/2017  | 8715JUI      |
| 11/01/2017  | 8715JUI      |
| 01/01/2017  | 4715TEN      |

However, it should look like this:

| Recent Date | RECEIPT_ITEM |
|-------------|--------------|
| 11/22/2017  | 5844HAJ      |
| 12/31/2017  | 5521LRO      |
| 11/01/2017  | 8715JUI      |
| 01/01/2017  | 4715TEN      |

Can anybody please offer advice as to what I'm doing wrong? It looks like it is simply replacing the most recent date, not giving me just the row I want that is most recent.

Ultimately, I would like for my table to look like this. However, I don't know how to use the MAX() or MAX() OVER PARTITION BY() functions properly to allow for this:

| VEND_NUM | VEND_NAME    | RECEIPT_NUM | RECEIPT_ITEM | RECEIPT_DATE |
|----------|--------------|-------------|----------|--------------|
| 100      | SmallTech    | 2001        | 5844HAJ  | 11/22/2017   |
| 100      | SmallTech    | 3188        | 5521LRO  | 12/31/2017   |
| 100      | SmallTech    | 3232        | 8715JUI  | 11/01/2017   |
| 200      | RealSolution | 2101        | 4715TEN  | 01/01/2017   |

Upvotes: 6

Views: 109729

Answers (4)

Shinya
Shinya

Reputation: 11

Just passing by but I think you have to format the date to a 'YYYY-MM-DD' format so that it doesn't consider the "time".

Upvotes: 1

David Faber
David Faber

Reputation: 12495

I see a couple of issues here. One, the syntax for using the aggregate function MAX() as an analytic function (which is what Oracle helpfully calls a window function) looks like this:

MAX(receipt_date) OVER ( PARTITION BY receipt_item )

(note the position of the parentheses). Second, from your desired result set, you don't actually want a window function, you want to aggregate. A window (or analytic) function will always return a row for each row in its partition; that's just the way it works. So I think what you want is this:

WITH
-- various other subqueries above...
AllData AS
(
    SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE
      FROM tblVend
     INNER JOIN tblReceipt ON VEND_NUM = RECEIPT_VEND_NUM
     WHERE ( VEND_NUM = '100' OR VEND_NUM = '200' ) AND RECEIPT_DATE >= DATE'2017-01-01'
)
SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, MAX(RECEIPT_DATE)
  FROM AllData
 GROUP BY VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM;

Now I made some small changes to the above, such as wrapping parentheses around the OR conditions (using IN ('100','200') might be even better) since AND takes precedence over OR (so your query would have gotten results where VEND_NUM = '100' OR ( VEND_NUM = '200' RECEIPT_DATE >= DATE'2017-01-01' ) ... but maybe that's what you wanted?).

Upvotes: 5

Gordon Linoff
Gordon Linoff

Reputation: 1271003

THIS ANSWERS THE ORIGINAL VERSION OF THE QUESTION.

Your where clause should probably look like this:

 WHERE VEND_NUM IN ('100', '200') AND RECEIPT_DATE >= DATE '2017-01-01'

It is quite possible that what you want is simply:

SELECT DISTINCT RECEIPT_DATE, RECEIPT_ITEM
FROM tblVend INNER JOIN
     tblReceipt
     ON VEND_NUM = RECEIPT_VEND_NUM
WHERE VEND_NUM IN ('100', '200') AND RECEIPT_DATE >= DATE '2017-01-01';

At the very least, this returns what you want to return.

Upvotes: 0

kc2018
kc2018

Reputation: 1460

Use window function ROW_NUMBER() OVER (PARTITION BY receipt_item ORDER BY receipt_date DESC) to assign a sequence number to each row. The row with the most recent receipt_date for a receipt_item will be numbered as 1.

WITH
-- various other subqueries above...

    AllData AS
    (
        SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE,
        ROW_NUMBER() OVER (PARTITION BY RECEIPT_ITEM ORDER BY RECEIPT_DATE DESC ) AS RN
        FROM tblVend
            INNER JOIN tblReceipt ON VEND_NUM = RECEIPT_VEND_NUM
        WHERE
            VEND_NUM IN ( '100','200')  AND RECEIPT_DATE >= '01-Jan-2017'
    )
   SELECT VEND_NUM, VEND_NAME, RECEIPT_NUM, RECEIPT_ITEM, RECEIPT_DATE
   FROM AllData WHERE RN = 1

Upvotes: 7

Related Questions