Reputation: 16067
I've read that "if you are using loops in sql, you're probably doing it wrong", which led me here. Before the question though, some background on the scenario. I have the following database structure:
Think of GlobalSubmissions as a table that contains the ideal values that ideally, users should submit. UserSubmissions is a table that contains the values submitted by users.
I have written a function that estimates a user's reliability by comparing a user's "submission"s to a single, specific "globalSubmission":
CREATE FUNCTION dbo.GetUserReliabilityForSubmission(@userID int, @submissionID int)
RETURNS float
AS
BEGIN
DECLARE @userAverageValue float,
@idealValue float;
SET @userAverageValue = (
SELECT AVG (Value)
FROM UserSubmissions
WHERE (UserID = @userID AND SubmissionID = @submissionID));
SET @idealValue = (
SELECT IdealValue
FROM Submission
WHERE (SubmissionID = @submissionID));
RETURN 1 - ABS(@userAverageValue - @idealValue);
END
This works, but it calculates the reliability of a user based on only one specific submission ID. If I wanted to calculate the "global" reliability of the user, I need to use a loop to go through ALL distinct SubmissionID a user has ever submitted and run the procedure on it.
Is there a good alternative to using a loop in this case?
Upvotes: 1
Views: 152
Reputation: 125660
SELECT d.UserID, 1 - AVG(ABS(d.avg - d.IdealValue))
FROM (
SELECT us.UserID, gs.SubmissionID, gs.IdealValue, AVG(us.Value) as avg FROM UsersSubmissions us
JOIN GlobalSubmissions gs ON gs.SubmissionID = us.SubmissionID
GROUP BY us.UserID, gs.SubmissionID, gs.IdealValue) d
GROUP BY d.UserId
Working example: http://sqlfiddle.com/#!6/8d880/8
However, I would say it's not great way of defining reliability. I think you should consider changing it to something like that:
SELECT d.UserID, 1 - AVG(ABS(d.avg - d.IdealValue) / ABS(d.IdealValue))
FROM (
SELECT us.UserID, gs.SubmissionID, gs.IdealValue, AVG(us.Value) as avg FROM UsersSubmissions us
JOIN GlobalSubmissions gs ON gs.SubmissionID = us.SubmissionID
GROUP BY us.UserID, gs.SubmissionID, gs.IdealValue) d
GROUP BY d.UserId
And and example: http://sqlfiddle.com/#!6/8d880/14
What is the change? It takes the mistake relatively to the value itself. There is a difference between saying x = 5, when it's actually 6 and saying x = 500 when it's 501.
Upvotes: 2