Ahmad Abbasi
Ahmad Abbasi

Reputation: 1776

How to get Distinct values in SQL Query

I have a problem in my SQL Query

i want to select DISTINCT StudentGroups(SG) but the query gives me some repetations

here is my Query

SELECT      DISTINCT(SG.SGID), en.EnrollmentID, CR.Name AS Course, INS.Name as Instructor,
                S.Session, SG.StartTime, SG.EndTime, EN.CreateDate

    FROM        StudentGroups SG inner JOIN Enrollments EN ON SG.SGID = EN.SGID
                JOIN Courses CR ON SG.CourseID = CR.CourseID
                JOIN Class CL ON SG.ClassID = CL.ClassID
                JOIN Instructors INS ON SG.InstructorID = INS.InstructorID
                JOIN Sessions S ON SG.SessionID = S.SessionID

    WHERE       EN.SGID NOT IN ( SELECT SGID FROM Enrollments
                            WHERE StudentID = 45

UPDATED

This query gives me following data

enter image description here

but i don't want repeated SGID

Upvotes: 11

Views: 53126

Answers (2)

Geek
Geek

Reputation: 1530

Please use GROUP BY clause when DISTINCT is not working for your desired output.

Upvotes: 6

user1919238
user1919238

Reputation:

DISTINCT always applies to all columns returned. Putting parentheses around one column makes no difference in its behavior.

If you want your results to contain only unique values of SG.GID, you can use a GROUP BY clause instead--but then you have to decide some rules for which values you want to return in the other columns. You do this by using aggregate functions like MIN(),MAX(),COUNT(),SUM(), etc. Simplified example:

SELECT  SG.SGID,
        MIN(SG.START_TIME),              --the lowest start time for this sgid.
        COUNT(DISTINCT en.EnrollmentID)  --the unique enrollments for this sgid.
    FROM StudentGroups SG 
    INNER JOIN Enrollments EN ON SG.SGID = EN.SGID
    GROUP BY SG.SGID;

When joining multiple tables as in your original query, you have to be careful when counting and summing things, as duplicates from the join can give you incorrect results.

Another option would be to use ROW_NUMBER() to return one row for each SGID:

SELECT * FROM (
    SELECT  SG.SGID,
            SG.START_TIME, 
            en.EnrollmentID,
            ROW_NUMBER() OVER (PARTITION BY SGID ORDER BY SG.START_TIME) as RN
        FROM StudentGroups SG 
        INNER JOIN Enrollments EN ON SG.SGID = EN.SGID
    )
    WHERE RN = 1;

This numbers the rows for each SGID, starting at 1 and sorted by the value of SG.START_TIME. It will return the one row with the earliest start time for each SGID. If multiple rows have the same start time, it will select any one of those, more or less at random. You could add more fields to the ORDER BY clause to further define which rows are returned.

Upvotes: 16

Related Questions