Reputation: 47
Say, I have a table that I derived in mySQL:
doctor | patient_of_doctor | patient_bought_item_from_doctor_x_days_ago |
---|---|---|
Aaron | Jeff | 10 |
Aaron | Jeff | 20 |
Jess | Jason | 50 |
Jess | Jason | 20 |
Jess | Jason | 30 |
Aaron | Stu | 90 |
Aaron | Stu | 70 |
Aaron | Stu | 110 |
Aaron | Stu | 105 |
Now, I would want to make a new table from this one such that for each doctor, the patient's average buy interval is shown.
For this table,
Jeff bought from Aaron 10 and 20 days ago. Jeff's average buy interval is therefore, (20-10)/1 = 10.
Jason bought from Jess 20, 50 and 30 days ago. Jason's average buy interval is ((50-30)+(30-20))/2 = 15.
Stu bought from Aaron 90, 70, 110 and 105 days ago. Stu's average buy interval is ((110-105)+(105-90)+(90-70))/3 = (5+15+20)/3 =40/3 = 13.33
I would want to output a table that looks like this:
doctor | patient_of_doctor | avg_buy_interval |
---|---|---|
Aaron | Jeff | 10 |
Jess | Jason | 15 |
Aaron | Stu | 13.33 |
I am seriously considering using python to do this but I could not pass up the chance to learn some mySQL from you guys.
Thanks! Umesh
Upvotes: 1
Views: 56
Reputation: 7114
If you're using MySQL v8 (you can run SELECT version();
to check), you may try using either LEAD() or LAG() functions to get the previous or next x_days_ago
. Then you can use AVG()
on the subtraction result between the current row data with its previous or next x_days_ago
.
With LEAD()
:
SELECT doctor,
patient,
AVG(bxd-prev_data)
FROM
(SELECT doctor,
patient,
bought_x_days_ago AS bxd,
LEAD(bought_x_days_ago) /*using LEAD*/
OVER (PARTITION BY doctor, patient
ORDER BY bought_x_days_ago DESC) AS prev_data
FROM /*with ORDER BY in descending*/
mytable) v
WHERE prev_data IS NOT NULL
GROUP BY doctor,
patient;
With LAG()
:
SELECT doctor,
patient,
AVG(bxd-prev_data)
FROM
(SELECT doctor,
patient,
bought_x_days_ago AS bxd,
LAG(bought_x_days_ago) /*using LAG*/
OVER (PARTITION BY doctor, patient
ORDER BY bought_x_days_ago) AS prev_data
FROM /*with ORDER BY in ascending (default)*/
mytable) v
WHERE prev_data IS NOT NULL
GROUP BY doctor,
patient
On older MySQL version, you can try this:
SELECT doctor,
patient,
AVG(prev_bxd-bxd)
FROM
(SELECT m1.doctor, m1.patient, m1.bought_x_days_ago AS bxd,
SUBSTRING_INDEX(
GROUP_CONCAT(m2.bought_x_days_ago ORDER BY m2.bought_x_days_ago),',',1) AS prev_bxd
FROM mytable m1
LEFT JOIN mytable m2
ON m1.doctor=m2.doctor
AND m1.patient=m2.patient
AND m1.bought_x_days_ago < m2.bought_x_days_ago
GROUP BY
m1.doctor,
m1.patient,
m1.bought_x_days_ago) v
GROUP BY doctor,
patient;
I'll try to explain it the best I can. The first component here is the table's self LEFT JOIN
. The condition placed for the join is to match doctor
and patient
data exactly and to match the second table bought_x_days_ago
that is larger from the reference table value. In the SELECT
section, we're using GROUP_CONCAT()
over bought_x_days_ago
from the second table with default ascending order and use SUBSTRING_INDEX()
to get the first value from the GROUP_CONCAT()
result. Without the SUBSTRING_INDEX()
, the result look like this:
doctor | patient | bxd | prev_bxd |
---|---|---|---|
Aaron | Jeff | 10 | 20 |
Aaron | Jeff | 20 | NULL |
Aaron | Stu | 70 | 90,105,110 |
Aaron | Stu | 90 | 105,110 |
Aaron | Stu | 105 | 110 |
Aaron | Stu | 110 | NULL |
Jess | Jason | 20 | 30,50 |
Jess | Jason | 30 | 50 |
Jess | Jason | 50 | NULL |
Then with SUBSTRING_INDEX()
, it becomes like this:
doctor | patient | bxd | prev_bxd |
---|---|---|---|
Aaron | Jeff | 10 | 20 |
Aaron | Jeff | 20 | NULL |
Aaron | Stu | 70 | 90 |
Aaron | Stu | 90 | 105 |
Aaron | Stu | 105 | 110 |
Aaron | Stu | 110 | NULL |
Jess | Jason | 20 | 30 |
Jess | Jason | 30 | 50 |
Jess | Jason | 50 | NULL |
The next step is simple, just make that as a subquery then do the average calculation.
Upvotes: 2
Reputation: 26
To do That You should use the group by clause in sql
which will be somethin like this:
SELECT column1, column2
FROM table_name
WHERE [ conditions ]
GROUP BY column1, column2
ORDER BY column1, column2
which in your case would be this EXACT SQL Code:
SELECT doctor,
patient_of_doctor,
AVG(bought_item) as 'avg_buy_interval'
FROM Patients
GROUP BY doctor, patient_of_doctor
and you will have thi result:
Upvotes: 0