Calculate data dynamically of same row of column with conditions in SQL

I have a table with data of customer with interaction to touchpoint . I need to calculate the time-difference of customer & identify the customer as repeat if the customer is touching the point with in 180 seconds , but if the customer come again after 180 seconds of last interaction it will be termed as Unique & the counter will start again ..

Input Table

UID CREATION_DATE
Cust 1 13-06-2023 15:12
Cust 1 13-06-2023 18:43
Cust 1 13-06-2023 18:43
Cust 2 08-06-2023 22:25
Cust 2 10-06-2023 14:32
Cust 3 12-06-2023 10:23
Cust 4 13-06-2023 18:57
Cust 4 13-06-2023 18:58
Cust 4 13-06-2023 18:58

I have calculate the the time difference using timestamp & lag function using

SELECT unique_cust_id as UID ,creation_date,
   ROW_NUMBER() OVER ( PARTITION BY unique_cust_id  ORDER BY subs_msisdn ,creation_date aSC ) row_num,
   timestampdiff ( 2,creation_date - lag (creation_date) OVER (PARTITION BY unique_cust_id  ORDER BY creation_date ))
   AS diff 
   FROM DB2PROJ.TESTING_REPEAT

with output

UID CREATION_DATE ROW_NUM DIFF_time
Cust 1 13-06-2023 15:12 1
Cust 1 13-06-2023 18:43 2 12665
Cust 1 13-06-2023 18:43 3 14
Cust 2 08-06-2023 22:25 1
Cust 2 10-06-2023 14:32 2 144432
Cust 3 12-06-2023 10:23 1
Cust 4 13-06-2023 18:57 1
Cust 4 13-06-2023 18:58 2 34
Cust 4 13-06-2023 18:58 3 43

Desired Output

UID CREATION_DATE ROW_NUM DIFF_time Cum_Diff Category
Cust 1 13-06-2023 15:12 1 0 Unique
Cust 1 13-06-2023 18:43 2 12665 0 Unique
Cust 1 13-06-2023 18:43 3 14 14 Repeat
Cust 2 08-06-2023 22:25 1 0 Unique
Cust 2 10-06-2023 14:32 2 144432 0 Unique
Cust 3 12-06-2023 10:23 1 0 Unique
Cust 4 13-06-2023 18:57 1 0 Unique
Cust 4 13-06-2023 18:58 2 34 34 Repeat
Cust 4 13-06-2023 18:58 3 43 77 Repeat

Above cum_diff is calculated in excel where if row_num is 1 . termed as 0 & Unique in category , cumulative addition is done of diff_time & if its >180 then it termed as 0 in cum_diff & Unique in category & if its with in <=180 then it added & termed as repeat in category

Please help to build in excel

Upvotes: 0

Views: 82

Answers (1)

ahmed
ahmed

Reputation: 9181

For the category, you just need a case expression to check if the time difference between the current time and the previous time is greater than 180. For your cumulative sum of time differences, you have a gaps and islands problem, one approach to solve it is the use of a running sum that increased by 1 whenever the time difference is exceeding 180.

WITH t AS
(
  SELECT unique_cust_id as UID ,creation_date,
       ROW_NUMBER() OVER (PARTITION BY unique_cust_id  ORDER BY creation_date ) row_num,
       timestampdiff(second, 
             lag(creation_date, 1, creation_date) OVER (PARTITION BY unique_cust_id  ORDER BY creation_date), creation_date) AS DIFF_time
 FROM TESTING_REPEAT
),
grps AS
(
  SELECT *,
    CASE WHEN DIFF_time > 180 OR row_num = 1 THEN 'Unique' ELSE 'Repeat' END AS Category,
    SUM(CASE WHEN DIFF_time > 180 THEN 1 ELSE 0 END) OVER (PARTITION BY UID  ORDER BY creation_date) grp
  FROM t
)
SELECT UID, creation_date, row_num, DIFF_time,
       SUM(CASE WHEN DIFF_time <= 180 THEN DIFF_time ELSE 0 END) OVER (PARTITION BY UID, grp  ORDER BY creation_date) AS Cum_Diff,
       Category, grp
FROM grps

demo on MySQL

Upvotes: 0

Related Questions