Reputation: 2427
I am not very familiar with SQL and I hope some expert here can show me a suitable and efficient query for what I want to achieve. I am using DB2, by the way.
Below is a screenshot of sample data. What I need is for a given year, select the record with distinct ID1+ID2+Name columns and maximum (most recent) effective date (in YYYYMMDD format, stored as integer), with the above year being in between YearFrom and YearTo range.
For anyone who can't see the screenshot:
NAME   YearFrom  YearTo  ID1  ID2  EffDate
item1  2002      2005    AB   10   20091201
item1  2009      2013    AB   10   20100301
item2  2001      2004    XX   20   20050103
item2  2002      2009    XX   20   20060710
item2  2007      2013    XX   20   20090912
item3  2005      2010    YY   30   20110304
I hope I explained it well. For example, if a user is looking for items available in year 2011, item1 (with eff. date 20100301) and item2 (with eff. date 20090912) will be returned.
If someone is looking for items available in year 2008: item2 (with eff. date 20090912) and item 3 will be returned. Item 1 will not be returned in this case because the most recent record for item 1 has range of 2009-2013.
I think I have the first part of the query right, but I don't know how to select the valid records from those results based on the year, all in one query.
select name,id1,id2,max(effdate)
from [table]
group by name,id1,id2
Any help would be much appreciated.
Upvotes: 2
Views: 819
Reputation: 3342
You can use the query below for this type of output. First restrict the rows to those whose effective date is the maximum for that name, id1, id2 combination, then apply the year condition to those rows only.
SELECT NAME, Id1, Id2, Effdate
FROM Table_Name t_1
WHERE Effdate =
      -- latest effective date for this NAME/Id1/Id2 combination
      (SELECT MAX(t_2.Effdate)
       FROM Table_Name t_2
       WHERE t_2.NAME = t_1.NAME
         AND t_2.id1 = t_1.id1
         AND t_2.id2 = t_1.id2)
  AND Your_Year_Variable_Value BETWEEN t_1.Yearfrom AND t_1.Yearto
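To check this approach against the sample data from the question, here is a minimal sketch that runs the same idea (a correlated subquery taking MAX(effdate), then the year filter) on an in-memory SQLite database. SQLite stands in for DB2 here, and the table name table1, the ROWS list and the helper names make_db and items_for_year are assumptions made for illustration only.

```python
import sqlite3

# Sample rows from the question: name, yearfrom, yearto, id1, id2, effdate
ROWS = [
    ("item1", 2002, 2005, "AB", 10, 20091201),
    ("item1", 2009, 2013, "AB", 10, 20100301),
    ("item2", 2001, 2004, "XX", 20, 20050103),
    ("item2", 2002, 2009, "XX", 20, 20060710),
    ("item2", 2007, 2013, "XX", 20, 20090912),
    ("item3", 2005, 2010, "YY", 30, 20110304),
]

QUERY = """
SELECT name, id1, id2, effdate
FROM table1 t_1
WHERE effdate = (SELECT MAX(t_2.effdate)   -- latest record per name/id1/id2
                 FROM table1 t_2
                 WHERE t_2.name = t_1.name
                   AND t_2.id1 = t_1.id1
                   AND t_2.id2 = t_1.id2)
  AND ? BETWEEN t_1.yearfrom AND t_1.yearto  -- year filter on that record only
ORDER BY name
"""


def make_db():
    """Create an in-memory table holding the sample data (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (name TEXT, yearfrom INT, yearto INT, "
        "id1 TEXT, id2 INT, effdate INT)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def items_for_year(conn, year):
    """Return the latest record per item, if its range covers the given year."""
    return conn.execute(QUERY, (year,)).fetchall()


if __name__ == "__main__":
    db = make_db()
    print(items_for_year(db, 2008))
    print(items_for_year(db, 2011))
```

On the sample data this should return item2 and item3 for 2008, and item1 and item2 for 2011, which matches what the question expects.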
Upvotes: 1
Reputation: 95582
It's not clear whether these two statements are in conflict. I think they are in conflict, and I'm going with statement 1 in the code below.
[1.] What I need is for a given year, select the record with distinct ID1+ID2+Name columns and maximum (most recent) effective date (in YYYYMMDD format, stored as integer), with the above year being in between YearFrom and YearTo range.
[2.] Item 1 will not be returned in this case because the most recent record for item 1 has range of 2009-2013.
I would say that item 1 would not be returned, because it has no information for year 2008. If it did have information for 2008, it should be returned per statement 1 above, regardless of whether there happened to be more recent data.
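To make the difference between the two readings concrete, here is a small sketch that runs both on the question's sample data in an in-memory SQLite database (a stand-in for DB2). The table name table1 and the function names filter_then_max and max_then_filter are made up for this illustration.

```python
import sqlite3

# Sample rows from the question: name, yearfrom, yearto, id1, id2, effdate
ROWS = [
    ("item1", 2002, 2005, "AB", 10, 20091201),
    ("item1", 2009, 2013, "AB", 10, 20100301),
    ("item2", 2001, 2004, "XX", 20, 20050103),
    ("item2", 2002, 2009, "XX", 20, 20060710),
    ("item2", 2007, 2013, "XX", 20, 20090912),
    ("item3", 2005, 2010, "YY", 30, 20110304),
]


def make_db():
    """Create an in-memory table holding the sample data (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (name TEXT, yearfrom INT, yearto INT, "
        "id1 TEXT, id2 INT, effdate INT)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def filter_then_max(conn, year):
    """Statement 1: keep rows covering the year, then take the latest of those."""
    sql = """
        SELECT name, MAX(effdate)
        FROM table1
        WHERE ? BETWEEN yearfrom AND yearto
        GROUP BY name, id1, id2
        ORDER BY name
    """
    return conn.execute(sql, (year,)).fetchall()


def max_then_filter(conn, year):
    """Statement 2: take the latest row per item, then keep it only if it covers the year."""
    sql = """
        SELECT t.name, t.effdate
        FROM table1 t
        JOIN (SELECT name, id1, id2, MAX(effdate) AS effdate
              FROM table1
              GROUP BY name, id1, id2) m
          ON m.name = t.name AND m.id1 = t.id1
         AND m.id2 = t.id2 AND m.effdate = t.effdate
        WHERE ? BETWEEN t.yearfrom AND t.yearto
        ORDER BY t.name
    """
    return conn.execute(sql, (year,)).fetchall()


if __name__ == "__main__":
    db = make_db()
    for y in (2003, 2008):
        print(y, filter_then_max(db, y), max_then_filter(db, y))
```

For 2008 both readings agree on the sample data. For 2003 they do not: the first reading returns item1 and item2 with their older records, while the second returns nothing, because the most recent record of every item has a range that excludes 2003.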
If you expand your table so each year appears in a row by itself, rather than being implied by a range like 2002-2005, it's pretty simple. The query below is written for PostgreSQL; you should only have to replace the first common table expression with a DB2 equivalent that generates a table of numbers (or use an actual table of numbers), and fix up the CTE syntax. (DB2's CTE syntax is unique.)
with years as (
select generate_series(2000, 2020) as year
),
expanded_table1 as (
select id1, id2, name, year, yearfrom, yearto, effdate
from Table1
inner join years on years.year between YearFrom and YearTo
)
select id1, id2, name, year, max(effdate)
from expanded_table1
where year = 2008
group by id1, id2, name, year
Explanation
This query, the first CTE, generates a series of integers that represent all the years we might be interested in. A more robust solution might select the minimum and maximum years for the number generator from your table instead of using integer literals.
select generate_series(2000, 2020) as year;
YEAR
--
2000
2001
2002
...
2020
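The more robust variant mentioned above, taking the bounds from the table instead of from literals, could look roughly like the sketch below. It uses a recursive CTE in SQLite as a stand-in, since generate_series as used above is PostgreSQL-specific; DB2's recursive CTE syntax may differ in details, so treat this as an outline only. The table name table1 and the helper names make_db and years_from_table are assumptions.

```python
import sqlite3

# Sample rows from the question: name, yearfrom, yearto, id1, id2, effdate
ROWS = [
    ("item1", 2002, 2005, "AB", 10, 20091201),
    ("item1", 2009, 2013, "AB", 10, 20100301),
    ("item2", 2001, 2004, "XX", 20, 20050103),
    ("item2", 2002, 2009, "XX", 20, 20060710),
    ("item2", 2007, 2013, "XX", 20, 20090912),
    ("item3", 2005, 2010, "YY", 30, 20110304),
]

YEARS_SQL = """
WITH RECURSIVE
bounds(lo, hi) AS (
    -- smallest YearFrom and largest YearTo present in the data
    SELECT MIN(yearfrom), MAX(yearto) FROM table1
),
years(year) AS (
    SELECT lo FROM bounds
    UNION ALL
    SELECT year + 1 FROM years, bounds WHERE year < hi
)
SELECT year FROM years ORDER BY year
"""


def make_db():
    """Create an in-memory table holding the sample data (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (name TEXT, yearfrom INT, yearto INT, "
        "id1 TEXT, id2 INT, effdate INT)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def years_from_table(conn):
    """Return every year between the table's earliest YearFrom and latest YearTo."""
    return [row[0] for row in conn.execute(YEARS_SQL)]


if __name__ == "__main__":
    print(years_from_table(make_db()))
```

With the sample data the generated series runs from 2001 through 2013, so the numbers table never covers years that no record can match.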
By joining that table with your table, we can expand the ranges into rows.
with years as (
select generate_series(2000, 2020) as year
)
select id1, id2, name, year, yearfrom, yearto, effdate
from Table1
inner join years on years.year between YearFrom and YearTo
order by id1, id2, name, year;
ID1  ID2  NAME   YEAR  YEARFROM  YEARTO  EFFDATE
---  ---  -----  ----  --------  ------  --------
AB   10   item1  2002  2002      2005    20091201
AB   10   item1  2003  2002      2005    20091201
AB   10   item1  2004  2002      2005    20091201
AB   10   item1  2005  2002      2005    20091201
...
Having prepared the foundation this way, the query to find the maximum effective date for each distinct combination of id1, id2, name, for a given year is just a simple GROUP BY with a WHERE clause.
with years as (
select generate_series(2000, 2020) as year
),
expanded_table1 as (
select id1, id2, name, year, yearfrom, yearto, effdate
from Table1
inner join years on years.year between YearFrom and YearTo
)
select id1, id2, name, year, max(effdate)
from expanded_table1
where year = 2011
group by id1, id2, name, year
ID1  ID2  NAME   YEAR  MAX
---  ---  -----  ----  --------
AB   10   item1  2011  20100301
XX   20   item2  2011  20090912
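If you want to try the whole pipeline without a PostgreSQL instance, the sketch below runs the same expand-then-group query on an in-memory SQLite database. The only change is the first CTE, where a recursive CTE replaces generate_series; this is SQLite syntax, not DB2, and the table name table1 and the helper names make_db and max_effdate_for_year are assumptions made for illustration.

```python
import sqlite3

# Sample rows from the question: name, yearfrom, yearto, id1, id2, effdate
ROWS = [
    ("item1", 2002, 2005, "AB", 10, 20091201),
    ("item1", 2009, 2013, "AB", 10, 20100301),
    ("item2", 2001, 2004, "XX", 20, 20050103),
    ("item2", 2002, 2009, "XX", 20, 20060710),
    ("item2", 2007, 2013, "XX", 20, 20090912),
    ("item3", 2005, 2010, "YY", 30, 20110304),
]

QUERY = """
WITH RECURSIVE years(year) AS (
    -- stand-in for generate_series(2000, 2020)
    SELECT 2000
    UNION ALL
    SELECT year + 1 FROM years WHERE year < 2020
),
expanded_table1 AS (
    -- one row per record per year covered by its range
    SELECT id1, id2, name, year, yearfrom, yearto, effdate
    FROM table1
    INNER JOIN years ON years.year BETWEEN yearfrom AND yearto
)
SELECT id1, id2, name, year, MAX(effdate)
FROM expanded_table1
WHERE year = ?
GROUP BY id1, id2, name, year
ORDER BY id1, id2, name
"""


def make_db():
    """Create an in-memory table holding the sample data (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 (name TEXT, yearfrom INT, yearto INT, "
        "id1 TEXT, id2 INT, effdate INT)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def max_effdate_for_year(conn, year):
    """Latest effective date per id1/id2/name among records covering the year."""
    return conn.execute(QUERY, (year,)).fetchall()


if __name__ == "__main__":
    print(max_effdate_for_year(make_db(), 2011))
```

Passing the year as a parameter rather than a literal keeps the query text fixed, so the same statement can be reused for any year in the generated range.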
Upvotes: 1