Reputation: 3977
I am a little confused as to where would be the preferred place to add the condition when using CTE
with ROW_NUMBER OVER PARTITION
.
I have a table containing the following columns:
UserID, BranchNumber, MemberDate
and MemberStatus
Note: A member can have multiple memberships at different locations:
The following code gives me one less record: 17069
WITH CTE AS
(
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY [MemberDate] DESC) AS RowNumber
FROM MemberTable
WHERE BranchNumber = '01'
)
SELECT * FROM CTE WHERE RowNumber = 1 AND MemberStatus = 'Active'
The following code gives one additional record: 17070
WITH CTE AS
(
SELECT
*
,ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY [MemberDate] DESC) AS RowNumber
FROM MemberTable
WHERE BranchNumber = '01' AND MemberStatus = 'Active'
)
SELECT * FROM CTE WHERE RowNumber = 1
I am just confused as to why the difference and which is the right way?
The correct amount of records is 19000.
Upvotes: 4
Views: 8436
Reputation: 69524
Since you have mentioned "A member can have multiple memberships at different locations" , try partition it by the "BranchNumber"
as well and see if it gives you the result you are looking for.
The way you are doing it, it is returning the Very last Membership start date for any location, if you partition by the BranchNumber it will return the very last Membership start date for each user for each Branch Number , try something like this
;WITH
CTE
AS
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY UserID, BranchNumber ORDER BY [MemberDate] DESC)
AS RowNumber
FROM MemberTable
)
SELECT * FROM CTE
WHERE RowNumber = 1 AND MemberStatus = 'Active'
Upvotes: 0
Reputation: 171178
The ROW_NUMBER
is evaluated after the WHERE
. Let's make it more intuitive by splitting the two apart:
WITH CTE1 AS
(
SELECT
*
FROM MemberTable
WHERE BranchNumber = '01' AND MemberStatus = 'Active'
)
, CTE2 AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY [MemberDate] DESC) AS RowNumber
FROM CTE1
)
SELECT * FROM CTE2 WHERE RowNumber = 1
I think it is pretty clear now that you need the filter applied first. Think of the CTE's as virtual tables. You can debug this by changing the final select to SELECT * FROM CTE1
. Just look at what is returned.
Upvotes: 1
Reputation: 18759
The queries are different, so I would expect the records to be different.
The first query says:
BranchNumber = 1
, order by the MemberDate
and partition on userId
MemberStatus is Active
The second query says:
activemember
records from BranchNumber = 1
, order by the MemberDate
and partition on userId
So, the first result set may be getting the latest record from the user, inactive or not, and then you filter out the inactive ones. The second will only return active records, so I would expect more.
Upvotes: 0
Reputation: 37364
Both are "right" in a sense they return what is asked. (2) gives more records because you apply extra condition (MemberStatus = 'Active'
) in the subquery(cte). So record where MemberStatus are not equal to "Active" cannot have "RowNumber=1" . (1) doesn't filter such rows in CTE, so it's possible that it returns record[s] with RowNumber=1 and MemberStatus <> 'Active'
which got removed from final resultset by applying condition in outer query.
Upvotes: 1