Reputation: 307
I have two tables (one that shows a record of all the programs sent to a customer, and another that shows a record of all the programs the user took action on). My ultimate goal is to count how many of those programs that were sent to the user were acted upon. I have a way of doing that right now but there is a problem. The programs may be sent to the user multiple times (I will use 20 for an example) within a single day. If the user takes action on that program within 24 hours, I don't want it to count as 19 programs that weren't acted on and 1 program that was acted on. I want it to count as 1 acted upon program with 0 fails.
The table that contains the programs sent to the customer needs to be filtered down so that each program that is sent to the customer only shows once per 24 hour period.
My solution is to have a "window" type thing so that once a program is sent to a customer, it is 'locked' for that customer for 24 hours so it won't appear in my 'programs sent to customer' query multiple times.
I have a table like this:
Customer Time Program
-----------------------------------
1 8:05 a
1 10:30 a
1 11:30 a
1 12:30 b
1 1:25 a
2 9:38 b
2 10:38 c
2 1:36 c
2 2:40 c
2 3:41 b
.
.
.
I want to get a table (query, not delete) that removes duplicate programs per customer within a certain time frame (confusing! I know)
Here is what I want (with a 3hr time frame for example):
Customer Time Program
-----------------------------------
1 8:05 a
1 11:30 a
1 12:30 b
2 9:38 b
2 10:38 c
2 2:40 c
2 3:41 b
.
.
.
Upvotes: 3
Views: 633
Reputation: 115510
I think that what you mean can be solved (only) with a recursive query. Here is a solution:
WITH cte AS
( SELECT
customer, program, time,
ROW_NUMBER()
OVER (PARTITION BY customer, program
ORDER BY time)
AS rn,
MIN(time)
OVER (PARTITION BY customer, program
ORDER BY time
RANGE BETWEEN 3.0/24 FOLLOWING
AND UNBOUNDED FOLLOWING)
AS next_time
FROM a
)
SELECT
customer, time, program
FROM
cte
START WITH rn = 1
CONNECT BY PRIOR customer = customer
AND PRIOR program = program
AND PRIOR next_time = time
ORDER BY
customer, time, program ;
You can also replace MIN(time)
with FIRST_VALUE(time)
above and get the same results. It might be more efficient.
Tested at SQL-Fiddle
The efficiency on a big table is probably not going to be very good. You can try running the query with smaller set of data.
And you should at least add this index, so it does an index scan:
CREATE INDEX ix -- choose a name for the index
ON tableX -- the table name
(customer, program, time) ;
You could also skip the final ordering or change it so it is more similar to the used index:
ORDER BY
customer, program, time ;
Upvotes: 1
Reputation: 1269443
Try this:
select *
from t
where not exists (select 1 from t t2
where t2.customer = t.customer and
t2.program = t.program and
t2.time - t.time < 3.0/24 and
t2.time > t.time
)
Date time arithmetic depends on databases, but this works for many of them.
Upvotes: 2