keewee279
keewee279

Reputation: 1654

Oracle SQL: Select Distinct combination of two columns among other columns

I am an Oracle SQL beginner and hope someone here can help me with this.

I have a pretty basic Select query which works so far. This is selecting click data, i.e. how often a user clicks on a certain class of buttons on a website.

My problem is that I need to exclude duplicate or multiple clicks from the same user but only those that are duplicates.

Each user has a unique ID (CUSTOMER_ID) and I can get either the HIT_DAY or the HIT_DATETIME to identify the time of the click. However, there are scenarios where a user needs to click multiple buttons of the same class the same day.

Is there something along the lines of "get me only records where the user (CUSTOMER_ID) is different OR where the click time (HIT_DATETIME) is unique for +/- 10 seconds ? Or anything else I could do here ? I don't have other unique identifiers.

My query:

SELECT
    owh.HIT_DAY
    , owh.HIT_DATETIME
    , COUNT(owh.CUSTOMER_ID) AS COUNT_CUSTOMER_ID
    , owh.IS_P_CUSTOMER AS P_CUSTOMER
    , owh.SHORTENED_URL AS URL_SHORTENED
    , owh.MP
FROM
    O_WIN_HITS owh
WHERE 1=1
    AND ...
    AND ...
    AND ...
    AND ...
GROUP BY
    owh.HIT_DAY
    , owh.HIT_DATETIME
    , owh.IS_P_CUSTOMER
    , owh.SHORTENED_URL
    , owh.MP
ORDER BY
    owh.HIT_DAY DESC

Note: This is part of a larger query.

Many thanks in advance for any help, Mike

Upvotes: 0

Views: 614

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269773

I think you want something like this:

select owh.*
from (select owh.*,
             lag(hit_datetime) over (partition by customer_id order by hit_datetime) as prev_hdt
      from O_WIN_HITS owh
     ) owh
where prev_hdt is null or
      hit_datetime > prev_hdt + 10 / (24 * 60 * 60);

This returns rows that are either the first row for the customer or appear for a customer more than 10 seconds after the previous hit.

Upvotes: 1

Related Questions