Reputation: 306
I'm tracking number of steps/day. I want to get the average steps/day using the 5 best days out of a 7 day period. My end goal is going to be to get an average for the best 5 out of 7 days for a total of 16 weeks.
Here's my sqlfiddle - http://sqlfiddle.com/#!9/5e69bdf/2
Here is the query I'm currently using but I've discovered the result is not correct. It's taking the average of 7 days instead of selecting the 5 days that had the most steps. It's outputting 14,122 as an average instead of 11,606 based on my data as posted in the sqlfiddle.
SELECT SUM(a.steps) as StepsTotal, AVG(a.steps) AS AVGSteps
FROM (SELECT * FROM activities
JOIN Courses
WHERE activities.encodedid=? AND activities.activitydate BETWEEN
DATE_ADD(Courses.Startsemester, INTERVAL $y DAY) AND
DATE_ADD(Courses.Startsemester, INTERVAL $x DAY)
ORDER BY activities.steps DESC LIMIT 5
) a
GROUP BY a.encodedid
Here's the same query with the values filled in for testing:
SELECT SUM(a.steps) as StepsTotal, AVG(a.steps) AS AVGSteps
FROM (SELECT * FROM activities
JOIN Courses
WHERE activities.encodedid='42XPC3' AND activities.activitydate BETWEEN
DATE_ADD(Courses.Startsemester, INTERVAL 0 DAY) AND
DATE_ADD(Courses.Startsemester, INTERVAL 6 DAY)
ORDER BY activities.steps DESC LIMIT 5
) a
GROUP BY a.encodedid
Upvotes: 1
Views: 1151
Reputation: 147166
As @SloanThrasher pointed out, the reason the query is not working is because you have multiple rows for the same course in the Courses
database which end up being joined to the activities
database. Thus the output for the subquery gives the top value (16058) 3 times plus the second highest value (11218) twice for a total of 70610 and an average of 14122. You can work around this by modifying the query as follows:
SELECT SUM(a.steps) as StepsTotal, AVG(a.steps) AS AVGSteps
FROM (SELECT * FROM activities
JOIN (SELECT DISTINCT Startsemester FROM Courses) c
WHERE activities.encodedid='42XPC3' AND activities.activitydate BETWEEN
DATE_ADD(c.Startsemester, INTERVAL 0 DAY) AND
DATE_ADD(c.Startsemester, INTERVAL 6 DAY)
ORDER BY CAST(activities.steps AS UNSIGNED) DESC LIMIT 5
) a
GROUP BY a.encodedid
Now since there are actually only 3 days with activity (2018-07-16, 2018-07-17 and 2018-07-18) between the start of semester and 6 days later (2018-07-12 and 2018-07-18) this gives a total of 37533 (16058+11218+10277) and an average of 12517.7.
StepsTotal AVGSteps
37553 12517.666666666666
Ideally, you probably also want to add a constraint on the Course chosen from Courses
e.g. change
(SELECT DISTINCT Startsemester FROM Courses)
to
(SELECT DISTINCT Startsemester FROM Courses WHERE CourseNumber='PHED1164')
Upvotes: 2
Reputation: 33945
Consider the following:
DROP TABLE IF EXISTS my_table;
CREATE TABLE `my_table`
(id SERIAL PRIMARY KEY
,steps INT NOT NULL
);
insert into my_table (steps) values
(9),(5),(7),(7),(7),(8),(4);
select prev
, sum(steps) total
from (
select steps
, case when @prev = grp
then @j:=@j+1 else @j:=1 end j
, @prev:=grp prev
from (SELECT steps
, case when mod(@i,3)=0
then @grp := @grp+1 else @grp:=@grp end grp -- a 3 day week
, @i:=@i+1 i
from my_table
, (select @i:=0,@grp:=0) vars
order
by id) x
, (select @prev:= null, @j:=0) vars
order by grp,steps desc,i) a
where j <=2 -- top 2 (out of 3)
group by prev;
+------+-------+
| prev | total |
+------+-------+
| 1 | 16 |
| 2 | 15 |
| 3 | 4 |
+------+-------+
http://sqlfiddle.com/#!9/ee46d7/11
Upvotes: 0
Reputation: 37367
Try this query:
SELECT @rn := 1, @weekAndYear := 0;
SELECT weekDayAndYear,
SUM(steps),
AVG(steps)
FROM (
SELECT @weekAndYear weekAndYearLag,
CASE WHEN @weekAndYear = YEAR(activitydate) * 100 + WEEK(activitydate)
THEN @rn := @rn + 1 ELSE @rn := 1 END rn,
@weekAndYear := YEAR(activitydate) * 100 + WEEK(activitydate) weekDayAndYear,
steps,
lightly_act_min,
fairly_act_min,
sed_act_min,
vact_min,
encodedid,
activitydate,
username
FROM activities
ORDER BY YEAR(activitydate) * 100 + WEEK(activitydate), CAST(steps AS UNSIGNED) DESC
) a WHERE rn <= 5
GROUP BY weekDayAndYear
With additional variables, I imitate SQL Server ROW_NUMBER
function, to number from 1 to 7 days partitioned by weeks. This way I can filter best 5 days and easily get a average grouping by column weekAndDate
, which is in the same format as variable: yyyyww
(i used integer to avoid casting to varchar
).
Upvotes: 1