Reputation: 403
See query below
Select count(*) FROM
(Select distinct Student_ID, Name, Student_Age, CourseID from student) a1
JOIN
(Select distinct CourseID, CourseName, TeacherID from courses) a2
ON a1.CourseID=a2.CourseID
JOIN
(Select distinct TeacherID, TeacherName, Teacher_Age from teachers) a3
ON a2.TeacherID=a3.TeacherID
The subqueries must be used for deduping purpose.
This query run fine in PostgreSQL. However, if I add a condition between the student and teacher table, according to the execution plan, Postgres will wrongly nested loop join the student and teach tables which have no direct relationship. For example:
Select count(*) FROM
(Select distinct Student_ID, Name, Student_Age, CourseID from student) a1
JOIN
(Select distinct CourseID, CourseName, TeacherID from courses) a2
ON a1.CourseID=a2.CourseID
JOIN
(Select distinct TeacherID, TeacherName, Teacher_Age from teachers) a3 ON
a2.TeacherID=a3.TeacherID
WHERE Teacher_Age>=Student_Age
This query will take forever to run. However, if I replace the subqueries with the tables, it'll run very fast. Without using temp tables to store the deduping result, is there a way to to avoid the nested loop in this situation?
Thank you for your help.
Upvotes: 3
Views: 9108
Reputation: 1277
You're making the database perform a lot of unnecessary work to accomplish your goal. Instead of doing 3 different SELECT DISTINCT sub-queries all joined together, try joining the base tables directly to each other and let it handle the DISTINCT part only once. If your tables have proper indexes on the ID fields, this should run rather quick.
SELECT COUNT(1)
FROM (
SELECT DISTINCT s.Student_ID, c.CourseID, t.TeacherID
FROM student s
JOIN courses c ON s.CourseID = c.CourseID
JOIN teachers t ON c.TeacherID = t.TeacherID
WHERE t.Teacher_Age >= s.StudentAge
) a
Upvotes: 2