Reputation: 663
I cannot find any source that mentions the execution order for Partition By window functions in SQL.
Is it in the same order as Group By?
Select *, row_number() over (Partition by Name)
from NPtable
Where Name = 'Peter'
I understand that if Where gets executed first, the query will only look at Name = 'Peter', and the window function then works on just this particular person instead of the entire table, which is much more efficient.
But when the query is:
Select top 1 *, row_number() over (Partition by Name order by Date)
from NPtable
Where Date > '2018-01-02 00:00:00'
Doesn't the window function need to be executed against the entire table first, and only then have the Date > condition applied? Otherwise the result would be wrong.
Upvotes: 16
Views: 13998
Reputation: 12979
It is part of the SELECT phase of the query execution. There are different types of SELECT clauses, based on the query.
PARTITION BY belongs to the OVER clause of SELECT. Here, a window is built from the result set produced by the previous stages: FROM, WHERE, GROUP BY etc.
The OVER clause defines a window or user-specified set of rows within a query result set. A window function then computes a value for each row in the window. You can use the OVER clause with functions to compute aggregated values such as moving averages, cumulative aggregates, running totals, or a top N per group results.
OVER ( [ PARTITION BY value_expression ] [ order_by_clause ] )
Arguments
PARTITION BY Divides the query result set into partitions. The window function is applied to each partition separately and computation restarts for each partition.
value_expression Specifies the column by which the rowset is partitioned. value_expression can only refer to columns made available by the FROM clause. value_expression cannot refer to expressions or aliases in the select list. value_expression can be a column expression, scalar subquery, scalar function, or user-defined variable.
order_by_clause Defines the logical order of the rows within each partition of the result set. That is, it specifies the logical order in which the window function calculation is performed.
order_by_expression Specifies a column or expression on which to sort. order_by_expression can only refer to columns made available by the FROM clause. An integer cannot be specified to represent a column name or alias.
You can read more about it in the documentation for the SELECT - OVER clause.
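As a rough illustration of "computation restarts for each partition", here is a minimal sketch in Python using the standard-library sqlite3 module rather than SQL Server. It assumes a SQLite build of 3.25 or later (the first with window functions). The Amount column and the sample rows are hypothetical, added only so there is something to total.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
# Amount is a hypothetical column; the original NPtable only shows Name and Date.
conn.execute("CREATE TABLE NPtable (Name TEXT, Date TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO NPtable VALUES (?, ?, ?)",
    [
        ("Peter", "2018-01-01 00:00:00", 10),
        ("Peter", "2018-01-03 00:00:00", 20),
        ("Mary", "2018-01-02 00:00:00", 5),
        ("Mary", "2018-01-04 00:00:00", 7),
    ],
)

# Running total per Name: the sum starts over in every partition.
running_totals = conn.execute(
    """
    SELECT Name, Date,
           SUM(Amount) OVER (PARTITION BY Name ORDER BY Date) AS running_total
    FROM NPtable
    ORDER BY Name, Date
    """
).fetchall()

for row in running_totals:
    print(row)
```

Peter's running total starts again from his own first row instead of continuing from Mary's last one.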
Upvotes: 3
Reputation: 1270723
row_number() (and other window functions) are allowed in two clauses:
SELECT
ORDER BY
The function is parsed along with the rest of the clause. After all, it is a function present in the clause. In both cases, the WHERE clause would be -- logically -- applied first, so the window function only sees the rows that remain after filtering.
Do note that this is a logical parsing of the query. The actual execution may have little to do with the structure of the query.
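To make the ORDER BY case concrete, here is a small sketch in Python with the standard-library sqlite3 module. It is not SQL Server: it assumes SQLite 3.25 or later, which also accepts window functions in ORDER BY, and the sample rows are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NPtable (Name TEXT, Date TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO NPtable VALUES (?, ?)",
    [
        ("Peter", "2018-01-01 00:00:00"),
        ("Peter", "2018-01-03 00:00:00"),
        ("Peter", "2018-01-05 00:00:00"),
        ("Mary", "2018-01-04 00:00:00"),
        ("Mary", "2018-01-06 00:00:00"),
    ],
)

# The window function appears only in ORDER BY. It numbers the rows that
# remain after WHERE, so the first remaining row of each person sorts first.
interleaved = conn.execute(
    """
    SELECT Name, Date
    FROM NPtable
    WHERE Date > '2018-01-02 00:00:00'
    ORDER BY row_number() OVER (PARTITION BY Name ORDER BY Date), Name
    """
).fetchall()

for row in interleaved:
    print(row)
```

Peter's 2018-01-03 row sorts alongside Mary's first row, which only happens if his 2018-01-01 row was filtered out before the numbering.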
Upvotes: 3
Reputation: 32693
Window functions are executed/calculated at the same stage as SELECT, stage 5 in your table. In other words, window functions are applied to all rows that are "visible" in the SELECT stage.
In your second example:
Select top 1 *,
row_number() over (Partition by Name order by Date)
from NPtable
Where Date > '2018-01-02 00:00:00'
WHERE is logically applied before the Partition by Name of the row_number() function.
Note that this is the logical order of processing the query, not necessarily how the engine physically processes the data.
If the query optimiser decides that it is cheaper to scan the whole table and later discard dates according to the WHERE filter, it can do so. But any such transformation must be performed in a way that keeps the final result consistent with the order of the logical steps outlined in the table you showed.
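Here is a minimal sketch of the difference, written in Python with the standard-library sqlite3 module instead of SQL Server. It assumes SQLite 3.25 or later and uses made-up rows. TOP 1 is left out so that every numbered row is visible. The second query shows what you would have to write if you really wanted the numbering computed over the whole table before filtering: move the window function into a derived table.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NPtable (Name TEXT, Date TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO NPtable VALUES (?, ?)",
    [
        ("Peter", "2018-01-01 00:00:00"),
        ("Peter", "2018-01-03 00:00:00"),
        ("Peter", "2018-01-05 00:00:00"),
        ("Mary", "2018-01-04 00:00:00"),
    ],
)

# WHERE is logically applied first: numbering starts at the first surviving row.
filtered_first = conn.execute(
    """
    SELECT Name, Date,
           row_number() OVER (PARTITION BY Name ORDER BY Date) AS rn
    FROM NPtable
    WHERE Date > '2018-01-02 00:00:00'
    ORDER BY Name, Date
    """
).fetchall()

# Numbering over the whole table, then filtering in the outer query.
numbered_first = conn.execute(
    """
    SELECT Name, Date, rn
    FROM (
        SELECT Name, Date,
               row_number() OVER (PARTITION BY Name ORDER BY Date) AS rn
        FROM NPtable
    ) AS numbered
    WHERE Date > '2018-01-02 00:00:00'
    ORDER BY Name, Date
    """
).fetchall()

print(filtered_first)
print(numbered_first)
```

In the first result Peter's 2018-01-03 row is numbered 1, in the second it is numbered 2, so the two forms are not interchangeable and the engine is not free to swap one for the other.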
Upvotes: 25