Reputation: 757
SELECT TOP 1 @CurrentStudentID = StudentID
FROM Courses WITH (NOLOCK)
WHERE Courses.CourseID = @CourseID
ORDER BY StudentID
-- Loop through all the students and find if he/she is registered for more than one course.
WHILE (@@ROWCOUNT > 0 AND @CurrentStudentID IS NOT NULL)
BEGIN
-- Select all other courses student is currently registered in.
IF @@ROWCOUNT > 0
BEGIN
-- return required information
END
ELSE
BEGIN
-- Perform some operations
END
-- Select the next registered student
SELECT TOP 1 @CurrentStudentID = StudentID
FROM Courses WITH (NOLOCK)
WHERE Courses.CourseID = @CourseID AND
Courses.StudentID > @CurrentStudentID
ORDER BY StudentID
END
Can someone help with my logic here? I wrote a stored procedure to find out if a student of a course is currently taking other courses from the same school.
I'm particularly worried about the two SELECT queries and the performance of a while loop if the number of students is huge. I feel the way I am doing it feels very contrived. I'm sure there are better ways to do this.
I've done SQL profiling on this stored procedure and it's duration can range from 0 - 60 ms for a single call. I don't understand why the same stored procedure's execution time is so random and inconsistent.
Appreciate any help. I only have 1 year plus of SQL Server 2008 experience.
Thanks in advance.
Upvotes: 0
Views: 46
Reputation: 1298
AS I mentioned, SQL is a set-based theory language. In other words, it is semi-relational with data sets that allows for efficient comparisons between groups of data. "Lower" languages such as C++ or Java do not maintain such large data sets, since they are cursor (line by line) based-languages.
High level as this definition is, the point is to think of your data like EXCEL
sheets. You have predefined columns such as CourseID
and StudentID
, that have information in the other columns that are dependent on those values (CourseID
1:1 Course_Name) and some information that is repetitive (CourseID can have multiple students).
True normalization includes removing interdependent columns, but lets not worry about that right now. The main focus is on what makes sense for the business. Your table has Identifying columns for its courses and students. So you do not need to use a cursor if those values do not have conflicting interdependent values.
SELECT StudentID, COUNT(COURSEID) AS CLASS_NUM
FROM COURSES
GROUP BY StudentID
HAVING COUNT(COURSEID) > 1
The GROUP BY
returns distinct sets of values from the columns listed, flattening the other rows and allowing aggregate functions like COUNT()
. (note: NULLS
are not counted in the COUNT()
. Use an ISNULL
function)
You have not yet limited the list, and yet you achieve the same results. After SQL flattens the rows, you can use a HAVING
clause to further limit the result sets from the GROUP BY
if needed.
Way faster than a cursor, definitely. :)
Now, if your table includes students in different semesters and years, you might consider adding this to the GROUP BY
, so that you have sets in your GROUP BY
(StudentID
and Year
)
Also, recall that the SELECT
statement LOGICALLY read AFTER the GROUP BY
and HAVING
clauses, so any columns listed in the SELECT
statement must appear in the GROUP BY
or or have an aggregate function.
Upvotes: 1