Reputation: 123
I am new to SQL Server and learning things with the help of Hacker Rank
Question - https://www.hackerrank.com/challenges/occupations/problem
query (with this we get all the rows)-
SELECT
[Doctor], [Professor], [Singer], [Actor]
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY OCCUPATION ORDER BY NAME) [RowNumber], * FROM OCCUPATIONS
) AS tempTable
PIVOT
(
MAX(NAME) FOR OCCUPATION IN ([Doctor], [Professor], [Singer], [Actor])
) AS pivotTable
with out the subquery (we get only one row as min or max takes one row from each occupation)-
SELECT [Doctor], [Professor], [Singer], [Actor]
FROM occupations as t
PIVOT
(
max(name)
FOR occupation IN ([Doctor], [Professor], [Singer], [Actor])
) AS P
so, what is the difference between two queries? how does SELECT ROW_NUMBER() OVER (PARTITION BY OCCUPATION ORDER BY NAME) [RowNumber], * FROM OCCUPATIONS
make any difference?
Upvotes: 0
Views: 480
Reputation: 1157
This is a good question. ROW-NUMBER() is used in writing many SQL queries. Here what matters is partition by clause. This query uses occupation to partition where it groups each occupation and assign the row number. So, we get a result like this,
Then it sends the result to pivot, so it pivots the partitions and select the max(name) from partitions. in this case we get multiple rows with Null for the missing values.
But when we use select table query without windows aggregation and occupation parttion, We will get only one raw as it finds the the possible result from all records by based on occupaton. This case we will end up with one raw which contains the max(name)
Result:
Therefore, remember the column we use for partition should be considered in this kind of queries. It makes sense.
Upvotes: 1
Reputation: 4187
You only have the columns Name and Occupation - if you pivot by Occupation and aggregate the name only one row will be returned. By adding the row_number the aggregation will not reduce the output to a single row since each row has a differnet row_number.
Upvotes: 1