Reputation: 2487
I'm working with some web clicks data, and am just looking for the most recent page_name with the user_id visited (by a timestamp). Using the below code, the user_id is repeated and page_name with shown, with sorted descending. However, I would just like recent_click always = 1. The query when complete will be used as a subquery in a larger query.
Here is my current code:
SELECT user_id,
page_name,
row_number() over(partition by session_id order by ts desc) as recent_click
from clicks_data;
user_id | page_name | recent_click
--------+-------------+--------------
0001 | login | 1
0001 | login | 2
0002 | home | 1
Upvotes: 12
Views: 52192
Reputation: 62861
You should be able to move your query to a subquery and add where
criteria:
SELECT user_id, page_name, recent_click
FROM (
SELECT user_id,
page_name,
row_number() over (partition by session_id order by ts desc) as recent_click
from clicks_data
) T
WHERE recent_click = 1
Upvotes: 20
Reputation: 3453
You should move the row_number()
function into a subquery and then filter it in the outer query.
Something like this:
SELECT * FROM (
SELECT
[user_id]
,[page_name]
,ROW_NUMBER() OVER (PARTITION BY [session_id]
ORDER BY [ts] DESC) AS [recent_click]
FROM [clicks_data]
)x
WHERE [recent_click] = 1
Upvotes: 2