Reputation: 29159
I thought the following three SQL statements are semantically the same. The database engine will expand the second and third query to the first one internally.
select ....
from T
where Id = 1
select *
from
(select .... from T) t
where Id = 1
select *
from
(select .... from T where Id = 1) t
However, I found the window function behaves differently. I have the following code.
-- Prepare test data
with t1 as
(
select *
from (values ( 2, null), ( 3, 10), ( 5, -1), ( 7, null), ( 11, null), ( 13, -12), ( 17, null), ( 19, null), ( 23, 1759) ) v ( id, col1 )
)
select *
into #t
from t1
alter table #t add primary key (id)
go
The following query returns all the rows.
select
id, col1,
cast(substring(max(cast(id as binary(4)) + cast(col1 as binary(4)))
over (order by id
rows between unbounded preceding and 1 preceding), 5, 4) as int) as lastval
from
#t
id col1 lastval
-------------------
2 NULL NULL
3 10 NULL
5 -1 10
7 NULL -1
11 NULL -1
13 -12 -1
17 NULL -12
19 NULL -12
23 1759 -12
Without CTE/subquery: then I added a condition just return the row which Id = 19
.
select
id, col1,
cast(substring(max(cast(id as binary(4)) + cast(col1 as binary(4))) over (order by id rows between unbounded preceding and 1 preceding), 5, 4) as int) as lastval
from
#t
where
id = 19;
However, lastval
returns null
?
With CTE/subquery: now the condition is applied to the CTE:
with t as
(
select
id, col1,
cast(substring(max(cast(id as binary(4)) + cast(col1 as binary(4))) over (order by id rows between unbounded preceding and 1 preceding ), 5, 4) as int) as lastval
from
#t)
select *
from t
where id = 19;
-- Subquery
select
*
from
(select
id, col1,
cast(substring(max(cast(id as binary(4)) + cast(col1 as binary(4))) over (order by id rows between unbounded preceding and 1 preceding), 5, 4) as int) as lastval
from
#t) t
where
id = 19;
Now lastval
returns -12
as expected?
Upvotes: 2
Views: 1761
Reputation: 17020
The logic order of operations of the SELECT statement is import to understand the results of your first example. From the Microsoft documentation, the order is, from top to bottom:
- FROM
- ON
- JOIN
- WHERE
- GROUP BY
- WITH CUBE or WITH ROLLUP
- HAVING
- SELECT
- DISTINCT
- ORDER BY
- TOP
Note that the WHERE
clause processing happens logically before the SELECT
clause.
The query without the CTE is being filtered where id = 19
. The order of operations causes the where
to process before the window function in the select
clause. There is only 1 row with an id
of 19. Therefore, the where
limits the rows to id = 19 before the window function can process the rows between unbounded preceding and 1 preceding
. Since there are no rows for the window function, the lastval
is null
.
Compare this to the CTE. The outer query's filter has not yet been applied, so the CTE operates an all of the data. The rows between unbounded preceding
finds the prior rows. The outer part of the query applies the filter to the intermediate results returns just the row 19 which already has the correct lastval
.
You can think of the CTE as creating a temporary #Table with the CTE data in it. All of the data is logically processed into a separate table before returning data to the outer query. The CTE in your example creates a temporary work table with all of the rows that includes the lastval
from the prior rows. Then, the filter in the outer query gets applied and limits the results to id
19.
(In reality, the CTE can shortcut and skip generating data, if it can do so to improve performance without affecting the results. Itzik Ben-Gan has a great example of a CTE that skips processing when it has returned enough data to satisfy the query.)
Consider what happens if you put the filter in the CTE. This should behave exactly like the first example query that you provided. There is only 1 row with an id = 19, so the window function does not find any preceding rows:
with t as ( select id, col1,
cast(substring(max(cast(id as binary(4)) + cast(col1 as binary(4))) over ( order by id
rows between unbounded preceding and 1 preceding ), 5, 4) as int) as lastval
from #t
where id = 19 -- moved filter inside CTE
)
select *
from t
Upvotes: 2
Reputation: 3
The querys you are comparing are not equivalent.
select id ,
(... ) as lastval
from #t
where id = 19;
will take only 1 row, so 'lastval' will take NULL from col1 as for the windowed function does not find preceding row.
Upvotes: 0
Reputation: 38023
Window functions operate on your result set, so when you added where id = 19
your result set only had 1 row. Since your window function specifies rows between unbounded preceding and 1 preceding
there was no preceding row, and resulted in null
.
By using the subquery/cte you are allowing the window function to operate over the unfiltered result set (where the preceding rows exist), then retrieving only those rows from that result set where id = 19
.
Upvotes: 0