Alecs
Alecs

Reputation: 2316

ORDER BY ROW_NUMBER

UPD: thanks for all, topic closed, after sleeping I understand everything =)

I have a problem with understanding OVER clause and and ROW_NUMBER function. Simple table - name and mark. I want to calculate average mark for each name.

SELECT top 1 with ties name,  ROW_NUMBER() over (PARTITION BY name ORDER BY name) as number
FROM table 
ORDER BY AVG(mark) OVER(PARTITION BY name)

it will display something like this, and I understand why - that is what ROW_NUMBER() does

name|number
Pete 1
Pete 2

But if I write

SELECT top 1 with ties name,  ROW_NUMBER() over (PARTITION BY name ORDER BY name) as number
FROM table 
ORDER BY AVG(mark) OVER(PARTITION BY name), number

it will display

name|number
Pete 1

And this time I don't understand how ORDER BY works with ROW_NUMBER() function. Can somebody explain it to me?

Upvotes: 7

Views: 61857

Answers (2)

beach
beach

Reputation: 8640

You can certainly order by ROW_NUMBER column because the SELECT clause is evaluated before the ORDER BY clause. You can ORDER BY any column or column alias. This is why no error message was thrown (because it is valid).

SELECT name,  ROW_NUMBER() over (PARTITION BY name ORDER BY name) as number
FROM @table 
ORDER BY number

Evaluates to

name       number
---------- --------------------
John       1
pete       1
pete       2
John       2
pete       3

OP's second example of row_number is not correct.

SELECT AVG(mark) OVER(PARTITION BY name), name,  ROW_NUMBER() over (PARTITION BY name ORDER BY name) as number
FROM @table 
ORDER BY AVG(mark) OVER(PARTITION BY name), number

Returns as expected because AVG is the first sort column followed by number.

            name       number
----------- ---------- --------------------
11          pete       1
11          pete       2
11          pete       3
17          John       1
17          John       2

Change the query to number DESC and pete is still first however the row numbers are descending order.

            name       number
----------- ---------- --------------------
11          pete       3
11          pete       2
11          pete       1
17          John       2
17          John       1

SQL Order of operations

FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause

Upvotes: 15

gbn
gbn

Reputation: 432261

You can't ORDER BY the ROW_NUMBER directly: I don't know why you didn't get an error on this case, but normally you would. Hence the use of derived tables or CTEs

SELECT
    name, number
FROM
    (
    SELECT
       name,
       ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) as number,
       AVG(mark) OVER (PARTITION BY name) AS nameavg
    FROM table
    ) foo
ORDER BY
   nameavg, number

However, PARTITION BY name ORDER BY name is meaningless. Each partition has random order because the sort is the partition

I suspect you want something like this where ROW_NUMBER is based on AVG

SELECT
    name, number
FROM
    (
    SELECT
       name,
       ROW_NUMBER() OVER (PARTITION BY name ORDER BY nameavg) AS number
    FROM
        (
        SELECT
           name,
           AVG(mark) OVER (PARTITION BY name) AS nameavg
        FROM table
        ) foo
    ) bar
ORDER BY
    number

Or more traditionally (but name is collapsed for the average)

SELECT
    name, number
FROM
    (
    SELECT
       name,
       ROW_NUMBER() OVER (PARTITION BY name ORDER BY nameavg) AS number
    FROM
        (
        SELECT
           name,
           AVG(mark) AS nameavg
        FROM
           table
        GROUP BY
           name
        ) foo
    ) bar
ORDER BY
    number

You can maybe collapse the derived foo and bar into one with

ROW_NUMBER() OVER (PARTITION BY name ORDER BY AVG(mark))

But none of this makes sense: I understand that your question is abstract about how it works bit it is unclear question. It would make more sense if you described what you want in plain English and with sample input and output

Upvotes: 3

Related Questions