Reputation: 12271
Can someone please help me with the query to get the last record before the Purchase was made. The data set looks like below
UserID EventType ProductID EventTime Price Campaign
123ABC Click P1 5/9/2018 2:33 NULL C1
123ABC Click P1 5/10/2018 2:07 NULL C1
123ABC Click P1 5/16/2018 2:14 NULL C1
123ABC Click P2 5/9/2018 2:33 NULL C1
123ABC Click P2 5/10/2018 2:07 NULL C1
123ABC Click P2 5/16/2018 2:14 NULL C1
123ABC Purchase P2 5/22/2018 4:11 19.44 NULL
123ABC Click P3 5/9/2018 2:33 NULL C1
123ABC Click P3 5/10/2018 2:07 NULL C1
123ABC Click P3 5/11/2018 15:57 NULL C1
123ABC Click P3 5/16/2018 2:14 NULL C1
123ABC Purchase P4 5/22/2018 4:11 31.44 NULL
Output
UserID EventType ProductID EventTime Price Campaign
123ABC Click P2 5/16/2018 2:14 19.44 C1
123ABC NoEvent P4 5/22/2018 4:11 31.44 NULL
I need to find the last record when click happened just before the purchase i.e.
123ABC Click P2 5/16/2018 2:14 NULL C1
and if there are no Clicks before a purchase for a particular product then just output the Purchase record
123ABC NoEvent P4 5/22/2018 4:11 31.44 NULL
I tried to find the sequence to get the group of records which leaded to Purchase using the below query Example
Select *,
row_number()
over(partition by UserId,ProductId order by EventTime) as Ordering
from Purchase
UserID EventType ProductID EventTime Price Campaign Ordering
123ABC Click P1 5/9/2018 2:33 NULL C1 1
123ABC Click P1 5/10/2018 2:07 NULL C1 2
123ABC Click P1 5/16/2018 2:14 NULL C1 3
123ABC Click P2 5/9/2018 2:33 NULL C1 1
123ABC Click P2 5/10/2018 2:07 NULL C1 2
123ABC Click P2 5/16/2018 2:14 NULL C1 3
123ABC Purchase P2 5/22/2018 4:11 19.44 NULL 4
123ABC Click P3 5/9/2018 2:33 NULL C1 1
123ABC Click P3 5/10/2018 2:07 NULL C1 2
123ABC Click P3 5/11/2018 15:57 NULL C1 3
123ABC Click P3 5/16/2018 2:14 NULL C1 4
123ABC Purchase P4 5/22/2018 4:11 31.44 NULL 5
With the above grouping, i need to consider only those groups which have purchase in it and then filter the data. Currently I'm stuck with this approach. Can someone please help me out with the query
Upvotes: 1
Views: 99
Reputation: 2502
create table temp (
EventType varchar(10),
ProductID varchar(10),
Ordering int,
Price int
);
insert into temp
Select EventType, ProductID,
row_number(), Price
over(partition by UserId,ProductId order by EventTime) as Ordering
from Purchase;
select p.UserID, p.EventType, p.ProductID, p.EventTime, t.Price, p.Campaign from (
select ProductID, CASE WHEN Ordering = 1 THEN 1 ELSE Ordering -1 END as Ordering, Price
from temp t
where t.EventType = 'Purchase') a,
(
Select *,
row_number()
over(partition by UserId,ProductId order by EventTime) as Ordering
from Purchase
) p
where a.ProductID = p.ProductID
and a.Ordering = p.Ordering
For sure, it can be improved to reduce some syntax...I think it is not a must to create a new temp table. Besides, is this result you want?
Upvotes: 2
Reputation: 950
WITH CTE_PurchaseEvent AS
(SELECT
UserID,
ProductId,
EventTime,
ISNULL(lag(EventTime,1) over (PARTITION BY UserID, ProductId ORDER BY EventTime), '2000-01-01') AS PreviousPurchaseEventTime
FROM
Purchase
WHERE
EventType = 'Purchase')
,CTE_NonPurchaseEvent AS
(SELECT
Row_Number() OVER (PARTITION BY UserID, ProductId order by EventTime desc) as RowNum,
*
FROM
Purchase
WHERE
EventType != 'Purchase')
SELECT
PurchaseEvent.UserId,
PurchaseEvent.ProductId,
PurchaseEvent.EventTime AS PurchaseTime,
(SELECT MAX(p.EventTime) FROM CTE_NonPurchaseEvent P WHERE P.UserId = PurchaseEvent.UserId AND P.ProductId = PurchaseEvent.ProductId and
P.EventTime > PurchaseEvent.PreviousPurchaseEventTime AND P.EventTime < PurchaseEvent.EventTime) AS LastEventTimeBeforePurchase
FROM
CTE_PurchaseEvent AS PurchaseEvent
This will also cater for scenarios where the same user purchases the same product multiple times. I added some sample data to test the query. Check and let me know if they are not valid scenarios. SQL Fiddle
Upvotes: 1