Reputation: 1072
I have a table created by:
CREATE TABLE #test_table
(
id INT
,EventName VARCHAR(50)
,HomeTeam VARCHAR(25)
,Metric INT
)
INSERT INTO #test_table VALUES
(1, 'Team A vs Team B', 'Team A', 5),
(2, 'Team A vs Team B', 'Team A', 7),
(3, 'Team C vs Team D', 'Team C', 6),
(4, 'Team Z vs Team A', 'Team Z', 8),
(5, 'Team A vs Team B', 'Team A', 9),
(6, 'Team C vs Team D', 'Team C', 3),
(7, 'Team C vs Team D', 'Team C', 1),
(8, 'Team E vs Team F', 'Team E', 2)
Which results in:
id EventName HomeTeam Metric
------------------------------------------
1 Team A vs Team B Team A 5
2 Team A vs Team B Team A 7
3 Team C vs Team D Team C 6
4 Team Z vs Team A Team Z 8
5 Team A vs Team B Team A 9
6 Team C vs Team D Team C 3
7 Team C vs Team D Team C 1
8 Team E vs Team F Team E 2
A want to calculate a new column PreviousMetricN
where N can be 1, 2, 3, ... which shows the previous value for Metric
, but only if the HomeTeam
was involved in the previous event. For example:
id EventName HomeTeam Metric PreviousMetric1 PreviousMetric2
------------------------------------------------------------------------
1 Team A vs Team B Team A 5 NULL NULL
2 Team A vs Team B Team A 7 5 NULL
3 Team C vs Team D Team C 6 NULL NULL
4 Team Z vs Team A Team Z 8 NULL NULL
5 Team A vs Team B Team A 9 8 7
6 Team C vs Team D Team C 3 6 NULL
7 Team C vs Team D Team C 1 3 6
8 Team E vs Team F Team E 2 NULL NULL
I have been trying variations of LAG
with a new grouping variable in the PARTITION BY
clause such as
LAG(Metric) OVER(Partition by (CASE WHEN CHARINDEX(HomeTeam, EventName)>0 THEN 1 ELSE 0 END) ORDER BY id)
but without any success. How can this be done?
EDIT: I've also asked this question for Pandas here: Pandas shift - get previous value if multiple conditions satisfied
Upvotes: 4
Views: 2939
Reputation: 71805
I see no answer here that uses window functions and a single scan of the table. We can do this query in a single scan as follows:
Let us assume you have the AwayTeam
in another column.
If you don't have this yet and you wanted to parse it out of
EventData
:
We could use:SUBSTRING(EventData, CHARINDEX(' vs ', EventData) + 4)
I urge you to follow proper normalization and create this as a proper column in your table.
Our algorithm runs like this:
CROSS APPLY
Metric
s using LAG
, partitioning by the merged Team
columnSELECT id, HomeTeam, AwayTeam, Metric, Prev1, Prev2, Prev3
FROM (
SELECT *
,Prev1 = LAG(Metric, 1) OVER (PARTITION BY v.Team ORDER BY id)
,Prev2 = LAG(Metric, 2) OVER (PARTITION BY v.Team ORDER BY id)
,Prev3 = LAG(Metric, 3) OVER (PARTITION BY v.Team ORDER BY id)
-- more of these ......
FROM test_table
CROSS APPLY (VALUES (HomeTeam, 1),(AwayTeam, 0)) AS v(Team,IsHome)
) AS t
WHERE IsHome = 1
-- ORDER BY id --if necessary
Importantly, we can do this without the use of multiple different sorts, partitions or ordering, and without the use of a self-join. Just a single scan.
Result:
id | HomeTeam | AwayTeam | Metric | Prev1 | Prev2 | Prev3 |
---|---|---|---|---|---|---|
1 | Team A | Team B | 5 | (null) | (null) | (null) |
2 | Team A | Team B | 7 | 5 | (null) | (null) |
3 | Team C | Team D | 6 | (null) | (null) | (null) |
4 | Team Z | Team A | 8 | (null) | (null) | (null) |
5 | Team A | Team B | 9 | 8 | 7 | 5 |
6 | Team C | Team D | 3 | 6 | (null) | (null) |
7 | Team C | Team D | 1 | 3 | 6 | (null) |
8 | Team E | Team F | 2 | (null) | (null) | (null) |
Upvotes: 3
Reputation: 17
I believe this is what you are looking for:
;with cte as (
select id
, eventname
, hometeam
, metric
, CASE WHEN CHARINDEX(HomeTeam, EventName)>0 THEN LAG(Metric) OVER (Partition by HomeTeam ORDER BY id) ELSE NULL END previous from #test_table
)
select * ,CASE WHEN CHARINDEX(HomeTeam, EventName)>0 THEN LAG(previous) OVER (Partition by HomeTeam ORDER BY id) ELSE NULL END previous2
from cte
order by 1
Upvotes: 0
Reputation: 15893
First through self join and common table expression I have ranked all the previous eventname containing hometeam. We can have PreviousMetric1 from previous match and we can use Lead() window function to obtain PreviousMetric2. Please check below query:
with cte as(
select a.id,a.eventname,a.hometeam,a.metric,b.metric PreviousMetric1,
LEAD(b.metric)over (partition by a.id order by b.id desc) PreviousMetric2,
row_number()over(partition by a.id,a.hometeam order by b.id desc) rownum
from #test_table a
left join #test_table b
on charindex(a.hometeam,b.eventname)>0 and a.id>b.id
)select id,eventname,hometeam,metric,PreviousMetric1,PreviousMetric2 from cte
where rownum=1
You can also have PreviousMetric3 applying Lead() with 2 as second parameter. In this way you can have any number of previous metric. It's faster comparing to any other approach.
;with cte as(
select a.id,a.eventname,a.hometeam,a.metric,b.metric PreviousMetric1,
LEAD(b.metric)over (partition by a.id order by b.id desc) PreviousMetric2,
LEAD(b.metric,2)over (partition by a.id order by b.id desc) PreviousMetric3,
row_number()over(partition by a.id,a.hometeam order by b.id desc) rownum
from #test_table a
left join #test_table b
on charindex(a.hometeam,b.eventname)>0 and a.id>b.id
)select id,eventname,hometeam,metric,PreviousMetric1,PreviousMetric2 ,PreviousMetric3 from cte
where rownum=1
Upvotes: 1
Reputation: 175806
Using OUTER APPLY
and correlated subqueries:
SELECT *
FROM test_table c
OUTER APPLY (SELECT TOP 1 PreviousMetric1 = c2.Metric
FROM test_table c2
WHERE CHARINDEX(c.HomeTeam, c2.EventName)>0
AND c.id > c2.id
ORDER BY id DESC) s1
OUTER APPLY (SELECT PreviousMetric2 = c2.Metric
FROM test_table c2
WHERE CHARINDEX(c.HomeTeam, c2.EventName)>0
AND c.id > c2.id
ORDER BY id DESC OFFSET 1 ROWS FETCH NEXT 1 ROW ONLY) s2
ORDER BY id;
Output:
+-----+-------------------+-----------+---------+------------------+-----------------+
| id | EventName | HomeTeam | Metric | PreviousMetric1 | PreviousMetric2 |
+-----+-------------------+-----------+---------+------------------+-----------------+
| 1 | Team A vs Team B | Team A | 5 | | |
| 2 | Team A vs Team B | Team A | 7 | 5 | |
| 3 | Team C vs Team D | Team C | 6 | | |
| 4 | Team Z vs Team A | Team Z | 8 | | |
| 5 | Team A vs Team B | Team A | 9 | 8 | 7 |
| 6 | Team C vs Team D | Team C | 3 | 6 | |
| 7 | Team C vs Team D | Team C | 1 | 3 | 6 |
| 8 | Team E vs Team F | Team E | 2 | | |
+-----+-------------------+-----------+---------+------------------+-----------------+
Extedning with PreviousMetricN
is a matter of adding corresponding OUTER APPLY
sN with OFFSET N-1 ROWS FETCH ...
.
Upvotes: 1
Reputation: 1270081
The logic seems to be:
lag(metric, <n>) over (partition by hometeam order by id)
I don't see why eventName
is needed.
Upvotes: 1