Mayron
Mayron

Reputation: 2394

SQL Returning duplicate column values

What I want to do: Show only unique course IDs where each course ID shows 1 student with their project name and their most up to date project log.

Example:

+----------+-------------+-----------------+------------------+
| CourseID | StudentName | ProjectName     | LatestLog        |
+----------+-------------+-----------------+------------------+
| 1        | Bob         | Bob's Project   | 01/09/2017 09:15 |
+----------+-------------+-----------------+------------------+
| 2        | Tim         | Tim's Project   | 02/10/2017 13:00 |
+----------+-------------+-----------------+------------------+
| 3        | Billy       | Billy's Project | 02/12/2017 12:45 |
+----------+-------------+-----------------+------------------+

These are my database table relationships:

For each course, I want to find the most recently submitted project log using MAX on the LogDate (DATETIME value) and the student and project name relating to that log. (EDIT: NOT every student belonging to the course, just the student name and project name who submitted the next project log for review, i.e. the log with the newest date possibly using the MAX function)

The problem with this SQL query is that it returns many duplicate Course IDs and every student and project name for each log is being returned.

I hope I have described the problem clearly. How can I fix this? Thank you!

SELECT Course.ID, -- INT
       Student.Name, -- VARCHAR(50)
       StudentProjects.Name, -- VARCHAR(50)
       MAX(ProjectLogs.LogDate) AS LogDate -- DATETIME

FROM ProjectLogs INNER JOIN
     Student ON Student.ID = ProjectLogs.StudentID INNER JOIN
     StudentProjects ON StudentProjects.ID = Student.ProjectID INNER JOIN
     Course ON Course.ID = Student.CourseID

GROUP BY Course.ID, Student.Name, StudentProjects.Name

Upvotes: 1

Views: 54

Answers (2)

Edgars T.
Edgars T.

Reputation: 1139

This is what you are looking for:

select Id as course_id,
       student_name,
       student_name,
       project_name,
       LogDate
 from (
SELECT Course.ID, -- INT
       Student.Name student_name, -- VARCHAR(50)
       StudentProjects.Name project_name, -- VARCHAR(50)
       ProjectLogs.LogDate,
       row_number()over(partition by Course.ID order by ProjectLogs.LogDate desc) rn      -- DATETIME
FROM ProjectLogs INNER JOIN
     Student ON Student.ID = ProjectLogs.ID INNER JOIN
     StudentProjects ON StudentProjects.ID = Student.ID INNER JOIN
     Course ON Course.ID = Student.ID
) q
where rn = 1;

Upvotes: 1

digital.aaron
digital.aaron

Reputation: 5707

You can use a CTE to get the max LogDate for each student, and then join that to a simplified version of your query, like this:

;WITH MaxLogDate AS
(
    SELECT StudentID
        ,LogDate = MAX(LogDate) 
    FROM ProjectLogs
    GROUP BY StudentID
)
SELECT c.ID
    ,s.Name
    ,sp.Name
    ,m.LogDate
FROM Courses c
INNER JOIN Student s ON s.ID = c.StudentID
INNER JOIN StudentProjects sp ON sp.StudentID = s.ID
INNER JOIN MaxLogDate m ON m.StudentID = s.ID

Upvotes: 1

Related Questions