Reputation: 1
I am trying to use an SQL-query to find the predecessor municipality of a municipality or, if available, other predecessor municipalities.
Here is my input table (MUNICIPALMERGE):
municipality_id | valid_year_from | valid_year_to | predecessor_municipality_id |
---|---|---|---|
1000 | 1990 | 1995 | NULL |
1001 | 1996 | 2000 | 1000 |
1002 | 2001 | 2005 | 1001 |
1003 | 1990 | 2005 | NULL |
My goal is a table showing which municipality was valid in the corresponding year. For the example above, the correct table would be as follows:
year | municipality_id | valid_municipality_id |
---|---|---|
1990 | 1000 | 1000 |
1991 | 1000 | 1000 |
1992 | 1000 | 1000 |
1993 | 1000 | 1000 |
1994 | 1000 | 1000 |
1995 | 1000 | 1000 |
1996 | 1000 | 1001 |
1997 | 1000 | 1001 |
1998 | 1000 | 1001 |
1999 | 1000 | 1001 |
2000 | 1000 | 1001 |
2001 | 1000 | 1002 |
2002 | 1000 | 1002 |
2003 | 1000 | 1002 |
2004 | 1000 | 1002 |
2005 | 1000 | 1002 |
1990 | 1001 | 1000 |
1991 | 1001 | 1000 |
1992 | 1001 | 1000 |
1993 | 1001 | 1000 |
1994 | 1001 | 1000 |
1995 | 1001 | 1000 |
1996 | 1001 | 1001 |
1997 | 1001 | 1001 |
1998 | 1001 | 1001 |
1999 | 1001 | 1001 |
2000 | 1001 | 1001 |
2001 | 1001 | 1002 |
2002 | 1001 | 1002 |
2003 | 1001 | 1002 |
2004 | 1001 | 1002 |
2005 | 1001 | 1002 |
1990 | 1002 | 1000 |
1991 | 1002 | 1000 |
1992 | 1002 | 1000 |
1993 | 1002 | 1000 |
1994 | 1002 | 1000 |
1995 | 1002 | 1000 |
1996 | 1002 | 1001 |
1997 | 1002 | 1001 |
1998 | 1002 | 1001 |
1999 | 1002 | 1001 |
2000 | 1002 | 1001 |
2001 | 1002 | 1002 |
2002 | 1002 | 1002 |
2003 | 1002 | 1002 |
2004 | 1002 | 1002 |
2005 | 1002 | 1002 |
1990 | 1003 | 1003 |
1991 | 1003 | 1003 |
1992 | 1003 | 1003 |
1993 | 1003 | 1003 |
1994 | 1003 | 1003 |
1995 | 1003 | 1003 |
1996 | 1003 | 1003 |
1997 | 1003 | 1003 |
1998 | 1003 | 1003 |
1999 | 1003 | 1003 |
2000 | 1003 | 1003 |
2001 | 1003 | 1003 |
2002 | 1003 | 1003 |
2003 | 1003 | 1003 |
2004 | 1003 | 1003 |
2005 | 1003 | 1003 |
This means that the respective predecessor or successor municipalities, if any, should be found for each municipality.
Is this somehow possible and if so, how?
Thank you very much for your help.
I tried the Query below, but got some null values in the Valid_minicipality_id column.
WITH
YEARS AS
(
SELECT
1990 AS YEAR
UNION ALL
SELECT
YEAR+1
FROM
YEARS
WHERE
YEAR+1<=2005--YEAR(GETDATE())
)
--SELECT * FROM YEARS;
, YEARS_MUNICIPALITY AS
(
SELECT
Y.YEAR
, MM.MUNICIPALITY_ID
, ISNULL(MM.PREDECESSOR_MUNICIPALITY_ID, MM.MUNICIPALITY_ID) AS PREDECESSOR_MUNICIPALITY_ID
FROM
YEARS Y
INNER JOIN
MUNICIPALMERGE MM
ON
Y.YEAR BETWEEN MM.VALID_YEAR_FROM AND MM.VALID_YEAR_TO
)
--SELECT * FROM YEARS_MUNICIPALITY;
, ALL_YEARS_MUNICIPALITY AS
(
SELECT
Y.YEAR
, MM.MUNICIPALITY_ID
, MM.PREDECESSOR_MUNICIPALITY_ID
FROM
YEARS Y
CROSS JOIN
MUNICIPALMERGE MM
)
--SELECT * FROM ALL_YEARS_MUNICIPALITY;
SELECT
AYM.YEAR
, AYM.MUNICIPALITY_ID
, YM.MUNICIPALITY_ID AS VALID_MUNICIPALITY_ID
FROM
ALL_YEARS_MUNICIPALITY AYM
LEFT JOIN
YEARS_MUNICIPALITY YM
ON
AYM.YEAR=YM.YEAR
AND AYM.MUNICIPALITY_ID=YM.PREDECESSOR_MUNICIPALITY_ID
ORDER BY
AYM.MUNICIPALITY_ID
, AYM.YEAR;
Upvotes: 0
Views: 79
Reputation: 5916
I propose to consider an example where we first build a complete table of predecessors and followers by going down and up the tree.
The query covers all the years for each municipality_id for which there is data in the table. We don't extract these years specifically.
To avoid repetitions, we use the accumulated path.
The result of traversing the tree at the end of the text.
To extend the result for each year, you can use generate_series
or a recursive query
.
Both options seem quite acceptable to me.
See examples:
Traversing tree and expand with generate_series
with r as(
select municipality_id, municipality_id valid_municipality_id
,valid_year_from,valid_year_to
,predecessor_municipality_id
,cast(concat('-',municipality_id,'-') as varchar(1000)) path
from municipalmerge
union all
select r.municipality_id, t.municipality_id valid_municipality_id
,t.valid_year_from,t.valid_year_to
,t.predecessor_municipality_id
,cast(concat(path,t.municipality_id,'-') as varchar(1000)) path
from r inner join municipalmerge t
on t.municipality_id=r.predecessor_municipality_id
or t.predecessor_municipality_id=r.valid_municipality_id
where charindex(concat('-',t.municipality_id,'-'),path)=0
)
select value as year,municipality_id,valid_municipality_id
from r
cross apply generate_series(r.valid_year_from,r.valid_year_to) t
order by municipality_id,value
Example with 2 recursive queries.
with r as(
select municipality_id, municipality_id valid_municipality_id
,valid_year_from,valid_year_to
,predecessor_municipality_id
,cast(concat('-',municipality_id,'-') as varchar(1000)) path
from municipalmerge
union all
select r.municipality_id, t.municipality_id valid_municipality_id
,t.valid_year_from,t.valid_year_to
,t.predecessor_municipality_id
,cast(concat(path,t.municipality_id,'-') as varchar(1000)) path
from r inner join municipalmerge t
on t.municipality_id=r.predecessor_municipality_id
or t.predecessor_municipality_id=r.valid_municipality_id
where charindex(concat('-',t.municipality_id,'-'),path)=0
)
,allYears as(
select municipality_id,valid_municipality_id
,valid_year_from year
,valid_year_to
from r
union all
select municipality_id,valid_municipality_id
,year+1 year
,valid_year_to
from allYears
where (year+1)<=valid_year_to
)
select year,municipality_id,valid_municipality_id
from allyears
order by municipality_id,year
Details:
First recursion, before expanding for every year. Recursion depth is equal max number of merges for municipality_id.
with r as(
-- in anchor we take municipality_id and use id as root thru recursion
select municipality_id, municipality_id valid_municipality_id
,valid_year_from,valid_year_to
,predecessor_municipality_id
,cast(concat('-',municipality_id,'-') as varchar(1000)) path
,1 lvl -- for debug only
from municipalmerge
union all
select r.municipality_id, t.municipality_id valid_municipality_id
,t.valid_year_from,t.valid_year_to
,t.predecessor_municipality_id
-- accumulate path
,cast(concat(path,t.municipality_id,'-') as varchar(1000)) path
,lvl+1 lvl
from r inner join municipalmerge t -- join up and down
on t.municipality_id=r.predecessor_municipality_id
or t.predecessor_municipality_id=r.valid_municipality_id
-- exclude traversed points
where charindex(concat('-',t.municipality_id,'-'),path)=0
)
select * from r order by municipality_id,valid_year_from
municipality_id | valid_municipality_id | valid_year_from | valid_year_to | predecessor_municipality_id | path | lvl |
---|---|---|---|---|---|---|
1000 | 1000 | 1990 | 1995 | null | -1000- | 1 |
1000 | 1001 | 1996 | 2000 | 1000 | -1000-1001- | 2 |
1000 | 1002 | 2001 | 2005 | 1001 | -1000-1001-1002- | 3 |
1001 | 1000 | 1990 | 1995 | null | -1001-1000- | 2 |
1001 | 1001 | 1996 | 2000 | 1000 | -1001- | 1 |
1001 | 1002 | 2001 | 2005 | 1001 | -1001-1002- | 2 |
1002 | 1000 | 1990 | 1995 | null | -1002-1001-1000- | 3 |
1002 | 1001 | 1996 | 2000 | 1000 | -1002-1001- | 2 |
1002 | 1002 | 2001 | 2005 | 1001 | -1002- | 1 |
1003 | 1003 | 1990 | 2005 | null | -1003- | 1 |
Expanded output
year | municipality_id | valid_municipality_id |
---|---|---|
1990 | 1000 | 1000 |
1991 | 1000 | 1000 |
1992 | 1000 | 1000 |
1993 | 1000 | 1000 |
1994 | 1000 | 1000 |
1995 | 1000 | 1000 |
1996 | 1000 | 1001 |
1997 | 1000 | 1001 |
1998 | 1000 | 1001 |
1999 | 1000 | 1001 |
2000 | 1000 | 1001 |
2001 | 1000 | 1002 |
2002 | 1000 | 1002 |
2003 | 1000 | 1002 |
2004 | 1000 | 1002 |
2005 | 1000 | 1002 |
1990 | 1001 | 1000 |
1991 | 1001 | 1000 |
1992 | 1001 | 1000 |
1993 | 1001 | 1000 |
1994 | 1001 | 1000 |
1995 | 1001 | 1000 |
1996 | 1001 | 1001 |
1997 | 1001 | 1001 |
1998 | 1001 | 1001 |
1999 | 1001 | 1001 |
2000 | 1001 | 1001 |
2001 | 1001 | 1002 |
2002 | 1001 | 1002 |
2003 | 1001 | 1002 |
2004 | 1001 | 1002 |
2005 | 1001 | 1002 |
1990 | 1002 | 1000 |
1991 | 1002 | 1000 |
1992 | 1002 | 1000 |
1993 | 1002 | 1000 |
1994 | 1002 | 1000 |
1995 | 1002 | 1000 |
1996 | 1002 | 1001 |
1997 | 1002 | 1001 |
1998 | 1002 | 1001 |
1999 | 1002 | 1001 |
2000 | 1002 | 1001 |
2001 | 1002 | 1002 |
2002 | 1002 | 1002 |
2003 | 1002 | 1002 |
2004 | 1002 | 1002 |
2005 | 1002 | 1002 |
1990 | 1003 | 1003 |
1991 | 1003 | 1003 |
1992 | 1003 | 1003 |
1993 | 1003 | 1003 |
1994 | 1003 | 1003 |
1995 | 1003 | 1003 |
1996 | 1003 | 1003 |
1997 | 1003 | 1003 |
1998 | 1003 | 1003 |
1999 | 1003 | 1003 |
2000 | 1003 | 1003 |
2001 | 1003 | 1003 |
2002 | 1003 | 1003 |
2003 | 1003 | 1003 |
2004 | 1003 | 1003 |
2005 | 1003 | 1003 |
Upvotes: 0
Reputation: 95924
I'm not really sure this is the most performant way of doing it, but it did get there.
I use GENERATE_SERIES
to get all the calendar years, not an rCTE. If you aren't on SQL Server 2022+, then use a tally. For a small amount of data, an rCTE is "fine" for generating the rows, but it gets slow fast so I avoid them for generating such data in all scenarios and use a set-based method.
I then do use an rCTE to transverse the hierarchy of your data. I then get use a further CTE put the values on both sides (so you have 1000,1001 and 1001,1000) and get the DISTINCT
results.
Finally, I then JOIN
that to your original data, based on the value of the "valid" municipality. This results in the following:
CREATE TABLE dbo.YourTable (municipality_id int,
valid_year_from int,
valid_year_to int,
predecessor_municipality_id int);
GO
INSERT INTO dbo.YourTable (municipality_id,
valid_year_from,
valid_year_to,
predecessor_municipality_id)
VALUES (1000, 1990, 1995, NULL),
(1001, 1996, 2000, 1000),
(1002, 2001, 2005, 1001),
(1003, 1990, 2005, NULL);
GO
WITH MinMaxYears AS(
SELECT MIN(valid_year_from) AS MinYear,
MAX(valid_year_to) AS MaxYear
FROM dbo.YourTable),
Years AS(
SELECT GS.value AS CalendarYear
FROM MinMaxYears MMY
CROSS APPLY GENERATE_SERIES(MMY.MinYear, MMY.MaxYear, 1) GS),
Municipalities AS(
SELECT YT.municipality_id,
YT.municipality_id AS Current_municipality_id,
YT.predecessor_municipality_id
FROM dbo.YourTable YT
UNION ALL
SELECT M.municipality_id,
YT.municipality_id AS Current_municipality_id,
YT.predecessor_municipality_id
FROM dbo.YourTable YT
JOIN Municipalities M ON YT.municipality_id = M.predecessor_municipality_id),
ValidMunicipalities AS(
SELECT DISTINCT V.municipality_id,
V.valid_municipality_id
FROM Municipalities M
CROSS APPLY (VALUES(M.municipality_id, M.Current_municipality_id),
(M.Current_municipality_id, M.municipality_id))V(municipality_id,valid_municipality_id))
SELECT Y.CalendarYear,
VM.municipality_id,
VM.valid_municipality_id
FROM ValidMunicipalities VM
CROSS JOIN Years Y
JOIN dbo.YourTable YT ON Y.CalendarYear BETWEEN YT.valid_year_from AND YT.valid_year_to
AND VM.valid_municipality_id = YT.municipality_id
ORDER BY VM.municipality_id,
VM.valid_municipality_id,
Y.CalendarYear;
GO
DROP TABLE dbo.YourTable;
Upvotes: 0