Reputation: 914
I have the query below - which works, but it takes about 80 seconds to run in MSSMS. I am wondering if it can be made more efficient?
An explanation will follow below.
SELECT A.courseid, A.studentid, attendanceStatus, lessonDate,
(SELECT top 1 SnoozeFrom FROM [tblConsecutiveSnooze] C
WHERE A.courseID = C.courseID AND A.StudentID = C.StudentID
ORDER BY SnoozeFrom DESC ) AS latestSnooze ,
(SELECT top 1 SnoozeTimes FROM [tblConsecutiveSnooze] D
WHERE A.courseID = D.courseID AND A.StudentID = D.StudentID
ORDER BY SnoozeFrom DESC ) AS snoozeTimes
FROM [tblStudentAttendance] A INNER JOIN tblcourses
ON A.courseID = tblcourses.courseID
WHERE [lessonDate] > getdate()-21
AND EXISTS (SELECT * FROM tblstudentCourses B WHERE A.courseID = B.courseID
AND B.[DateLeft] IS NULL AND A.StudentID = B.StudentID)
ORDER BY courseid , studentID, [lessonDate]
So what I am doing is trying to access all student attendance records (from tblStudentAttendance) within the last 21 days, when i have confirmed (via the EXISTS) that the student is indeed still registered to the course.
Those 2 sub SELECT queries can be combined into one, but that does not have an impact on the run time of the query.
What does seem to effect the run time greatly is the EXISTS condition. So any suggestions appreciated.
UPDATE:
Sql Plan: http://199.127.217.23/mm.sqlplan
Indexes:
Upvotes: 0
Views: 86
Reputation: 50211
If the primary way that you access tblConsecutiveSnooze
is by CourseID
and StudentID
, then I highly recommend that you change the PK to be nonclustered and add a clustered index on CourseID, StudentID
. This is far superior to just adding a nonclustered index and leaving the clustered PK on id
. Furthermore, it's possible you don't even need the id
column, if there are no FKs to it (that don't make sense to switch to CourseID, StudentID
). While I am a proponent of surrogate keys, not every table needs them!
I'd also like to recommend that you stop naming columns simply id
. A column name should be the same in every table it exists within, including its base table.
Upvotes: 0
Reputation: 308
If you look at the execution plan you posted you will find that adding the missing index could improve the query performance by up to 33%. You can try adding the following non-clustered index to the tblStudentCourses Table. You can change the index name to whatever suits you.
USE [hvps453_datab2]
GO
CREATE NONCLUSTERED INDEX [NC_tblStuddentCourses_DL_SI_CI]
ON [dbo].[tblStudentCourses] ([DateLeft])
INCLUDE ([studentID],[courseId])
GO
Upvotes: 1
Reputation: 109
How about a derived table with a row number to find the most recent SnoozeFrom record? Alternatively a CTE could also be used.
SELECT A.courseid
, A.studentid
, attendanceStatus
, lessonDate
, ConsecSnooze.SnoozeFrom AS latestSnooze
, ConsecSnooze.SnoozeTimes AS snoozeTimes
FROM [tblStudentAttendance] A
INNER JOIN tblcourses
ON A.courseID = tblcourses.courseID
LEFT JOIN (
SELECT SnoozeFrom
, SnoozeTimes
, C.courseID
, C.StudentID
, ROW_NUMBER() OVER (PARTITION BY C.CourseID, C.StudentID ORDER BY SnoozeFrom DESC) AS RowNum
FROM [tblConsecutiveSnooze] C
) as ConsecSnooze ON ConsecSnooze.courseID = A.courseID
AND ConsecSnooze.StudentID = A.studentID
AND ConsecSnooze.RowNum = 1
WHERE [lessonDate] > getdate() - 21
AND EXISTS (
SELECT *
FROM tblstudentCourses B
WHERE A.courseID = B.courseID
AND B.[DateLeft] IS NULL
AND A.StudentID = B.StudentID
)
ORDER BY courseid
, studentID
, [lessonDate]
Upvotes: 0