bob
bob

Reputation: 47

Get products from a specific year

I Want to select a site whose product end date is only 2018.

There are sites which have multiple end dates for products. For Example, For Site A, for Product P1, there will multiple End dates, 2018, 2019, 2020. I want a Site which has a product end date 2018

┌──────┬───────┬─────────┐
│ site │product│ EndDate │
├──────┼───────┼─────────┤
│   A  │   P1  │   2018  │
│   A  │   P2  │   2018  │
│   A  │   P1  │   2019  │
│   B  │   P2  │   2018  │
│   B  │   P1  │   2018  │
│   B  │   P2  │   2019  │
│   A  │   P1  │   2018  │
│   C  │   P2  │   2018  │
│   C  │   P1  │   2019  │
│   C  │   P2  │   2020  │
└──────┴───────┴─────────┘

According to sample data, the result I need is:

2nd row, A  P2  2018
5th row, B  P1  2018

The query I wrote gave nothing, I know it's wrong, but in case If you want to take a look:

select * from utk1
where EndDate not in (Select * from utk1 where EndDate = '2018') 
AND   EndDate not in (Select * from utk1 where EndDate = '2019')

Upvotes: 2

Views: 1142

Answers (6)

iSR5
iSR5

Reputation: 3498

Your question is not clear enough, as you asked for products with EndDate in 2018 and your EndDate column in the sample has multiple 2018 entries, while in the expecting results you only selected two entries without mentioning the criteria that needed to obtain the expected output (or the records pattern). In other word, you're giving the community a puzzle to be solved (or that how I see it).

Anyhow, from the given output, I guess you need to get the sites products that are listed within the year 2018.

So, (If my analytics skills is still in shape), from the given output, the second and 5th rows have been selected because of the their previous rows were in the same year (2018).

IF that's the case, you might need consider LAG() or LEAD() functions, they will be useful in your case. You can use these functions to add some starting and ending points which will make it easier to filter further.

In the case of your sample, I've used ROW_NUMBER() function (for the sake of simplicity). The output might vary on the actual data, so you need to check and assure you've got the right output, if you see some incorrect data, then LAG() or LEAD() will be your solution.

Here is the solution with ROW_NUMBER()

;WITH CTE AS(
SELECT 
    *,
    ROW_NUMBER() OVER(PARTITION BY [site] ORDER BY [site]) AS RN
FROM YourTable
) 
    SELECT
        [site], 
        product, 
        YEAR(EndDate)
    FROM CTE 
    WHERE 
        YEAR(EndDate) = 2018
    AND RN = 2
    ORDER BY [site]

Here is the solution with LAG() (To get the previous record)

;WITH CTE AS(
SELECT 
    [site], 
    product,
    EndDate AS StartDate, 
    LAG(EndDate) OVER(PARTITION BY [site] ORDER BY [site]) AS EndDate
FROM YourTable
) 
    SELECT
        [site], 
        product, 
        YEAR(EndDate)
    FROM CTE 
    WHERE 
        YEAR(StartDate)  = 2018
    AND YEAR(EndDate)    = 2018
    ORDER BY [site]

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270463

Your approach is okay. You just need the correct in logic:

select *
from utk1
where site in (Select site from utk1 where EndDate = '2018')  and
      site not in (Select site from utk1 where EndDate >= '2019');

Note: I don't really believe that you are storing years as strings. So, you should dispense with the single quotes. Mixing types is a bad practice. (If the values really are strings, add the single quotes below.)

If you want this at the product level, then use exists:

select u.*
from utk1
where not exists (select 1
                  from utk1 u2 
                  where u2.site = u.site and u2.product = u.product and
                        u2.enddate >= 2019
                 ) and
      exists (select 1
              from utk1 u2 
              where u2.site = u.site and u2.product = u.product and
                    u2.enddate = 2018
             ) ;

However, I think a better approach is aggregation:

select site, product
from utk1
group by site, product
having max(enddate) = 2018;

This gets you the site/product pairs which end in 2018. Based on the comments, that seems like the most reasonable interpretation of the question.

Upvotes: 1

Man
Man

Reputation: 772

select * from utk1
group by site,product
having max(EndDate) ='2018';

Upvotes: -1

Racil Hilan
Racil Hilan

Reputation: 25351

If you want products that fall in 2018 only and no other years, you need to add a condition of count to be 1:

SELECT site, product, MAX(EndDate) AS EndDate
FROM utk1
GROUP BY site, product
HAVING MAX(EndDate) = '2018'
   AND COUNT(DISTINCT EndDate) = 1;

Note: from your question, it seems that EndDate is a varchar. If that's not correct and it is a number, then change = '2018' to = 2018 (remove the quotes). Also this answer assumes that there could be years prior to 2018. If that's not the case, the last line is not required and can be removed.

Upvotes: 0

Krunal Soni
Krunal Soni

Reputation: 151

You should try this one

Select * from (
SELECT site, product, MAX(enddate) as enddate
FROM utk1 
GROUP BY site, product
)a where enddate = '2018'

Upvotes: 0

pfx
pfx

Reputation: 23294

Can you try the following

WITH cte (site, product, enddate) 
AS (
    SELECT site, product, MAX(enddate)
    FROM utk1
    GROUP BY site, product
)
SELECT site, product, enddate
FROM cte
WHERE enddate = 2018

EDIT

A set/cte is being created with for each site and product combination its maximum enddate. Afterwards only the ones having enddate 2018 are being considered.
Eg. for site B product P1 the max/highest enddate is 2019, this results in this row not being considered. For site B P2 the max enddate is 2018, so this row is part of the endresult.

Upvotes: 0

Related Questions