Dilhan Nakandala
Dilhan Nakandala

Reputation: 363

SQL Unable to filter for a specific date with Group By

This might be something simple but I cannot get the desired output.

I am using this simple query below to get the latest exp_date of an item_no

select g.item_no,max(g.exp_date)
from grn_drug_item g
group by g.item_no

1st query only displays following result for 01-AUG-20

IT00002530 01-AUG-20

Now I want to filter above results for a 'specific' latest exp_date (ex: 01-AUG-20 as above) so I modified the query to:

select g.item_no,max(g.exp_date)
from grn_drug_item g
where g.exp_date = to_date('01-AUG-20','DD-MON-YY')
group by g.item_no

But this gives me multiple rows and if I use where g.item_no = 'IT00002530' it fetches the desired output.

How can I get the filtering working for a specific exp_date here?

Upvotes: 0

Views: 620

Answers (6)

MT0
MT0

Reputation: 167867

TL;DR - Filter on date after grouping using the HAVING clause and you may also need to filter on a 24-hour range rather than a single date (if your dates have a non-midnight time component that is not being displayed by the user-interface you are using).


Lets assume that you have sample data that is something like this:

CREATE TABLE grn_drug_item ( item_no, exp_date ) AS
SELECT 'IT00002530', DATE '2020-08-01' + INTERVAL '12:00:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 'IT00002530', DATE '2020-07-01' FROM DUAL UNION ALL
SELECT 'ANOTHER001', DATE '2020-08-01' FROM DUAL UNION ALL
SELECT 'ANOTHER001', DATE '2020-08-05' FROM DUAL UNION ALL
SELECT 'OTHER00002', DATE '2020-07-01' FROM DUAL UNION ALL
SELECT 'OTHER00002', DATE '2020-08-04' FROM DUAL;

And your NLS_DATE_FORMAT session parameter is set to the default for America/China/UK of DD-MON-RR.

Then if you use your query:

SELECT g.item_no,
       MAX(g.exp_date)
FROM   grn_drug_item g
GROUP BY g.item_no

You will get the output:

ITEM_NO    | MAX(G.EXP_DATE)
:--------- | :--------------
ANOTHER001 | 05-AUG-20      
OTHER00002 | 04-AUG-20      
IT00002530 | 01-AUG-20      

Which matches your statement of:

1st query only displays following result for 01-AUG-20

 IT00002530 01-AUG-20

If we run the query:

SELECT g.item_no,
       MAX(g.exp_date)
FROM   grn_drug_item g
WHERE  g.exp_date = DATE '2020-08-01'
GROUP BY g.item_no

Then we are filtering the rows before grouping and will get the maximum from the filtered rows; which outputs:

ITEM_NO    | MAX(G.EXP_DATE)    
:--------- | :------------------
ANOTHER001 | 2020-08-01T00:00:00

(The NLS_DATE_FORMAT is now set to the ISO8601 format YYYY-MM-DD"T"HH24:MI:SS to display the time component as well)

This only shows the rows ITEM_NO that have a row where the EXP_DATE is exactly 2020-08-01T00:00:00. Since the sample data for IT00002530 has a time component of 12:00:00 it is not showing in the results.

To replicate your results I would need to filter on a 24 hour period and not a single instance at midnight:

SELECT g.item_no,max(g.exp_date)
FROM   grn_drug_item g
WHERE  g.exp_date >= DATE '2020-08-01'
AND    g.exp_date <  DATE '2020-08-01' + INTERVAL '1' DAY
GROUP BY g.item_no

Which would output:

ITEM_NO    | MAX(G.EXP_DATE)    
:--------- | :------------------
ANOTHER001 | 2020-08-01T00:00:00
IT00002530 | 2020-08-01T12:00:00

But you appear to want the rows where the maximum is on the day 01-AUG-20 so you would need to calculate the maximum and then filter on that; which you can do using a HAVING clause rather than a WHERE clause:

SELECT g.item_no,
       MAX(g.exp_date)
FROM   grn_drug_item g
GROUP BY g.item_no
HAVING MAX(g.exp_date) >= DATE '2020-08-01'
AND    MAX(g.exp_date) <  DATE '2020-08-01' + INTERVAL '1' DAY;

(Again, if your dates have a time component that is not always set to midnight then you want to filter on a 24 hour range rather than on a single instant.)

Which outputs:

ITEM_NO    | MAX(G.EXP_DATE)    
:--------- | :------------------
IT00002530 | 2020-08-01T12:00:00

db<>fiddle here

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269563

This is a slight variation on the other answers. But you might find that filtering before the aggregation is a significant win on performance:

select g.item_no, max(g.exp_date)
from grn_drug_item g
where g.exp_date >= date '2020-08-01'
group by g.item_no
having max(g.exp_date) = date '2020-08-01';

This may significantly reduce the amount of data that needs to be aggregated. Also note the use of date to introduce the date constant.

Upvotes: 1

user1945782
user1945782

Reputation:

To add to @AlexPoole's answer, you may also want to consider turning your simple script into a stored procedure.

This will allow you to utilise variable dates in the call. For instance:

CREATE PROCEDURE getItemNumbersByMaxDate
    @maxDate AS DATE --  Assuming SQL can utilise DATE type
AS
    SELECT   g.item_no,max(g.exp_date)
    FROM     grn_drug_item g
    GROUP BY g.item_no
    HAVING   MAX(g.exp_date) >= @maxDate

Please note that I've written this for T-SQL, you may need to adjust it for Oracle.

Upvotes: 1

MTay
MTay

Reputation: 129

It's because of your use of the WHERE clause. Your query currently selects all items that has one instance of expiry on the 1st of August, but can have multiple expiry dates.

Using HAVING MAX(g.exp_date) = to_date('01-AUG-20','DD-MON-YY') should get you what you want, as it identifies the latest expiry date for each item. Your query should look like this:

select g.item_no, max(g.exp_date)
from grn_drug_item g
group by g.item_no
having max(g.exp_date) = to_date('01-AUG-20','DD-MON-YY')

Upvotes: 1

Conffusion
Conffusion

Reputation: 4465

The where clause is executed before the GROUP BY so you don't get items with a highest exp_date = 1/8/2020 but you get items with at least one exp_date = 1/8/2020.

To apply the filtering on the result after the group by, you need to use HAVING:

select g.item_no,max(g.exp_date) as max_exp_date
from grn_drug_item g
group by g.item_no
having max_exp_date = to_date('01-AUG-20','DD-MON-YY')

Upvotes: 1

Alex Poole
Alex Poole

Reputation: 191245

Without access to your data this is slightly speculative, but it looks like you're picking up items which have a later exp_date. Your filter is too early - those items with later dates and that specific date will be included in the filter, and the maximum date within that subset is obviously that same date.

You can apply a filter after grouping with having:

select g.item_no,max(g.exp_date)
from grn_drug_item g
group by g.item_no
having max(g.exp_date) = to_date('01-AUG-20','DD-MON-YY')

The use of 'YY' is slightly worrying; using two-digit year at all is a problem, but if you must, then use RR. It's better to use 4-digit years though, and for a fixed date it's simpler to use a date literal:

having max(g.exp_date) = date '2020-08-01'

Upvotes: 2

Related Questions