Reputation: 71
I have a table as follow on the image screenshot of my table
customer_id purchase_date Difference between purchases (in days)
= WHAT I AM TRYING TO GET
1 23/04/2017 0 (first )row
1 24/04/2017 1
1 01/01/2018 252
2 03/05/2017 0 (this is a new customer)
2 10/05/2017 7
I want to calculate the time difference between two purchases for the same customer. I tried using the LAG and LEAD functions, but I get syntax error that I do not understand
For now I have been doing this:
SELECT customer_id, purchase_date
case
when lag(purchase_date,1,0) over (partition by customer_id order by
purchase_date) = 0 then 0
ELSE purchase_date -lag(purchase_date,1,0) over(partition by customer_id order
by purchase_date)
end
FROM
Table1
It gives me a syntax error that I dont understand just after the first "over"
Upvotes: 0
Views: 1280
Reputation: 11602
Edited 19-10-2019
Since 19-04-2018 MySQL 8 was released, which is much better then this answer to use MySQL's user variables to emulate/simulate LEAD()/LAG() window functions.
So if possible i advice to update your current MySQL version to MySQL 8 if possible.
Current MySQL versions do not support window functions like LAG and LEAD.
MySQL 8.0+ which is now a release candidate will support window functions but is not production ready jet.
In current MySQL versions you can simulate LAG with MySQL's user variables or with co-related subqueries.
Create table/insert data
CREATE TABLE Table1
(`customer_id` int, `purchase_date` varchar(10))
;
INSERT INTO Table1
(`customer_id`, `purchase_date`)
VALUES
(1, '23/04/2017'),
(1, '24/04/2017'),
(1, '01/01/2018'),
(2, '03/05/2017'),
(2, '10/05/2017')
;
The trick with MySQL user variables is it properly initialize them.
Query
SELECT
*
, (@customer_id := Table1.customer_id) AS init_customer_id_param
, (@purchase_date := Table1.purchase_date) AS init_purchase_date_param
FROM
Table1
CROSS JOIN (
SELECT
@customer_id := NULL
, @purchase_date := NULL
)
AS init_user_params
Result
| customer_id | purchase_date | @customer_id := NULL | @purchase_date := NULL | init_customer_id_param | init_purchase_date_param |
|-------------|---------------|----------------------|------------------------|------------------------|--------------------------|
| 1 | 23/04/2017 | (null) | (null) | 1 | 23/04/2017 |
| 1 | 24/04/2017 | (null) | (null) | 1 | 24/04/2017 |
| 1 | 01/01/2018 | (null) | (null) | 1 | 01/01/2018 |
| 2 | 03/05/2017 | (null) | (null) | 2 | 03/05/2017 |
| 2 | 10/05/2017 | (null) | (null) | 2 | 10/05/2017 |
Now you can add the calculation part.
Keep in mind the order is important the calculation needs to be done before the initialisation of the MySQL's user variables.
So the MySQL user variables have the previous column value.
Query
SELECT
*
, (
CASE
WHEN (@customer_id = Table1.customer_id)
THEN DATEDIFF(STR_TO_DATE(purchase_date, "%d/%m/%Y"), STR_TO_DATE(@purchase_date, "%d/%m/%Y"))
END
) AS diff
, (@customer_id := Table1.customer_id) AS init_customer_id_param
, (@purchase_date := Table1.purchase_date) AS init_purchase_date_param
FROM
Table1
CROSS JOIN (
SELECT
@customer_id := NULL
, @purchase_date := NULL
)
AS init_user_params
ORDER BY
STR_TO_DATE(purchase_date, "%d/%m/%Y") ASC
Note
iám using STR_TO_DATE function to format the varchar based dateformat into the datetime format.
If your date column is already is a date datatype you can remove the function and use THEN DATEDIFF(purchase_date, @purchase_date)
instead.
Result
| customer_id | purchase_date | @customer_id := NULL | @purchase_date := NULL | diff | init_customer_id_param | init_purchase_date_param |
|-------------|---------------|----------------------|------------------------|--------|------------------------|--------------------------|
| 1 | 23/04/2017 | (null) | (null) | (null) | 1 | 23/04/2017 |
| 1 | 24/04/2017 | (null) | (null) | 1 | 1 | 24/04/2017 |
| 1 | 01/01/2018 | (null) | (null) | 252 | 1 | 01/01/2018 |
| 2 | 03/05/2017 | (null) | (null) | (null) | 2 | 03/05/2017 |
| 2 | 10/05/2017 | (null) | (null) | 7 | 2 | 10/05/2017 |
Now it simply selecting the correct columns.
Query
SELECT
Table1_user_params.customer_id
, Table1_user_params.purchase_date
, (
CASE
WHEN Table1_user_params.diff IS NULL
THEN 0
ELSE Table1_user_params.diff
END
)
AS diff
FROM (
SELECT
*
, (
CASE
WHEN (@customer_id = Table1.customer_id)
THEN DATEDIFF(STR_TO_DATE(purchase_date, "%d/%m/%Y"), STR_TO_DATE(@purchase_date, "%d/%m/%Y"))
END
) AS diff
, (@customer_id := Table1.customer_id) AS init_customer_id_param
, (@purchase_date := Table1.purchase_date) AS init_purchase_date_param
FROM
Table1
CROSS JOIN (
SELECT
@customer_id := NULL
, @purchase_date := NULL
)
AS init_user_params
ORDER BY
STR_TO_DATE(purchase_date, "%d/%m/%Y") ASC
)
AS Table1_user_params
Result
| customer_id | purchase_date | diff |
|-------------|---------------|------|
| 1 | 23/04/2017 | 0 |
| 1 | 24/04/2017 | 1 |
| 1 | 01/01/2018 | 252 |
| 2 | 03/05/2017 | 0 |
| 2 | 10/05/2017 | 7 |
Upvotes: 2