Reputation: 2083
I have a table games with values such as:
+----------+------+
| game | year |
+----------+------+
| Football | 1999 |
| Football | 2000 |
| Football | 2001 |
| Football | 2002 |
| Cricket | 1996 |
| Tennis | 2001 |
| Tennis | 2002 |
| Tennis | 2003 |
| Tennis | 2009 |
| Golf | 1994 |
| Golf | 1996 |
| Golf | 1997 |
+----------+------+
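To reproduce the problem, the sample data above can be loaded with a minimal setup sketch like the following. The column types are an assumption, since they are not stated here, and the multi-row VALUES syntax may need adjusting for older database versions.

```sql
-- Assumed column types; adjust to match the real schema.
CREATE TABLE games (
    game VARCHAR(20),
    year INT
);

-- The twelve sample rows shown in the table above.
INSERT INTO games (game, year) VALUES
    ('Football', 1999),
    ('Football', 2000),
    ('Football', 2001),
    ('Football', 2002),
    ('Cricket',  1996),
    ('Tennis',   2001),
    ('Tennis',   2002),
    ('Tennis',   2003),
    ('Tennis',   2009),
    ('Golf',     1994),
    ('Golf',     1996),
    ('Golf',     1997);
```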
I am trying to find the games that have entries for at least three consecutive years in the table. My expected output is:
+----------+
| game |
+----------+
| Football |
| Tennis |
+----------+
Because:
Football: 1999, 2000, 2001, 2002
Tennis: 2001, 2002, 2003
To find the games with at least three consecutive entries, I first partitioned the table on game and then computed the difference between the current and the previous row, as below:
select game, year, case
when (year - lag(year) over (partition by game order by year)) is null then 1
else year - lag(year) over (partition by game order by year)
end as diff
from games
Output of the above query:
+----------+------+------+
| game | year | diff |
+----------+------+------+
| Football | 1999 | 1 |
| Football | 2000 | 1 |
| Football | 2001 | 1 |
| Football | 2002 | 1 |
| Cricket | 1996 | 1 |
| Tennis | 2001 | 1 |
| Tennis | 2002 | 1 |
| Tennis | 2003 | 1 |
| Tennis | 2009 | 6 |
| Golf | 1994 | 1 |
| Golf | 1996 | 2 |
| Golf | 1997 | 1 |
+----------+------+------+
I am not able to proceed from here: I don't know how to filter each game's rows on this difference to get the expected output.
Could anyone let me know if I am on the right track with this implementation? If not, how should I write the query to get the expected output?
Upvotes: 2
Views: 1043
Reputation: 198
Yes, your initial approach is correct. You were actually really close to fully figuring it out yourself.
What I would do is alter LAG a bit:
year - LAG(year, 2) OVER (
PARTITION BY game
ORDER BY year
)
For each row, this computes the difference between the year in the current row and the year two rows earlier in the same partition.
If the current row is the third of three consecutive years, the result is 2, which you can filter on in the WHERE clause of an outer query (a window function cannot be used directly in WHERE).
If your data contains duplicates, you need to group by game, year first.
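Put together, the idea might look like the sketch below. The alias names (g, t, diff2) are made up for illustration, and the inner SELECT DISTINCT is one possible way to handle duplicate (game, year) rows.

```sql
-- Sketch: a game qualifies if some row's year is exactly 2 greater
-- than the year two rows earlier within the same game.
SELECT DISTINCT game
FROM (
    SELECT game,
           year - LAG(year, 2) OVER (
               PARTITION BY game
               ORDER BY year
           ) AS diff2               -- NULL for the first two rows of each game
    FROM (
        -- Remove duplicate (game, year) pairs so they cannot skew the offset.
        SELECT DISTINCT game, year
        FROM games
    ) g
) t
WHERE diff2 = 2;
```

With the sample data in the question this should return Football and Tennis: Golf's only candidate row gives 1997 - 1994 = 3, and Cricket has a single row.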
Upvotes: 1
Reputation: 4806
You can use lag() and lead() and compare them to the current Year:
with u as
(select *, case
when lag(Year) over(partition by Game order by Year) = Year - 1
and lead(Year) over(partition by Game order by Year) = Year + 1
then 1 else 0
end as consec
from games)
select distinct Game
from u
where consec = 1;
Upvotes: 1
Reputation: 1071
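Using a CTE (Common Table Expression) and the ROW_NUMBER window function, this can be solved. Within a run of consecutive years, year minus the row number is constant, so grouping on that value and keeping groups of at least three rows finds the games (this assumes no duplicate game, year rows).
WITH CTE (game, grp) AS (
select game, year - ROW_NUMBER() OVER (PARTITION BY game order by year) grp
from games)
Select Distinct game
from CTE
Group by game, grp
Having count(*) >= 3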
Upvotes: 0
Reputation: 521289
You could use a self join approach here:
SELECT DISTINCT g1.Game
FROM games g1
INNER JOIN games g2
ON g2.Game = g1.Game AND g2.Year = g1.Year + 1
INNER JOIN games g3
ON g3.Game = g2.Game AND g3.Year = g2.Year + 1;
The above query requires any matching game to have at least one record for which the same game also has a record for the following year and another for the year after that.
Upvotes: 4