Reputation: 1654
I am an Oracle SQL beginner and hope someone here can help me with this.
I have a pretty basic Select query which works so far. This is selecting click data, i.e. how often a user clicks on a certain class of buttons on a website.
My problem is that I need to exclude duplicate or multiple clicks from the same user but only those that are duplicates.
Each user has a unique ID (CUSTOMER_ID
) and I can get either the HIT_DAY or the HIT_DATETIME
to identify the time of the click.
However, there are scenarios where a user needs to click multiple buttons of the same class the same day.
Is there something along the lines of "get me only records where the user (CUSTOMER_ID
) is different OR where the click time (HIT_DATETIME
) is unique for +/- 10 seconds ? Or anything else I could do here ?
I don't have other unique identifiers.
My query:
SELECT
owh.HIT_DAY
, owh.HIT_DATETIME
, COUNT(owh.CUSTOMER_ID) AS COUNT_CUSTOMER_ID
, owh.IS_P_CUSTOMER AS P_CUSTOMER
, owh.SHORTENED_URL AS URL_SHORTENED
, owh.MP
FROM
O_WIN_HITS owh
WHERE 1=1
AND ...
AND ...
AND ...
AND ...
GROUP BY
owh.HIT_DAY
, owh.HIT_DATETIME
, owh.IS_P_CUSTOMER
, owh.SHORTENED_URL
, owh.MP
ORDER BY
owh.HIT_DAY DESC
Note: This is part of a larger query.
Many thanks in advance for any help, Mike
Upvotes: 0
Views: 614
Reputation: 1269773
I think you want something like this:
select owh.*
from (select owh.*,
lag(hit_datetime) over (partition by customer_id order by hit_datetime) as prev_hdt
from O_WIN_HITS owh
) owh
where prev_hdt is null or
hit_datetime > prev_hdt + 10 / (24 * 60 * 60);
This returns rows that are either the first row for the customer or appear for a customer more than 10 seconds after the previous hit.
Upvotes: 1