Skodik.o
Skodik.o

Reputation: 514

Translate SQL statement to relational algebra

I have SQL statement:

    SELECT need.studentid, need.firstname, need.lastname, need.courseid
FROM 
     (SELECT student.studentid, student.firstname, student.lastname, enrollment.courseid 
      FROM prereq LEFT JOIN enrollment ON prereq.courseid = enrollment.courseid LEFT JOIN student ON enrollment.studentid = student.studentid) AS need

WHERE need.studentid NOT IN (SELECT studentid 
FROM enrollment
WHERE NOT(grade LIKE '%F' OR grade IS NULL OR grade LIKE 'N'))

My goal is to convert this to relational algebra. I've managed to convert Left Joins and inner select, but I don't know how to connect this with second part and how to convert NOT IN to relational algebra.

Upvotes: 0

Views: 326

Answers (1)

spencer7593
spencer7593

Reputation: 108400

If we aren't depending on the behavior we'd get if the subquery in the NOT IN (subquery) would return a NULL value for `studentid`...

(If the subquery returns a NULL, then the NOT IN comparison will never evaluate to TRUE, for any row, so the outer query wouldn't return any rows.)

It could be there's some sort of guarantee that `studentid` won't ever be NULL, like its defined to be NOT NULL or it's the PRIMARY KEY... but we don't see that. And there isn't a studentid IS NOT NULL condition in that subquery, so we have to anticipate the behavior that will occur if the subquery does return a NULL.

If we aren't interested in retaining that behavior,

then the NOT IN (subquery) could be expressed as an antijoin pattern.

SELECT need.studentid
     , need.firstname
     , need.lastname
     , need.courseid
  FROM ( SELECT student.studentid
              , student.firstname
              , student.lastname
              , enrollment.courseid 
           FROM prereq
           LEFT
           JOIN enrollment
             ON prereq.courseid = enrollment.courseid
           LEFT
           JOIN student
             ON enrollment.studentid = student.studentid
       ) need
  LEFT
  JOIN enrollment aj
    ON aj.studentid = need.studentid
   AND NOT ( aj.grade LIKE '%F' OR aj.grade IS NULL OR aj.grade LIKE 'N')
 WHERE aj.studentid IS NULL

The anti-join pattern looks like an outer join (to return all rows from one side along with matching rows from the other), and a condition in the WHERE clause that excludes all rows that had a match.

This works because at.studentid is guaranteed to be non-NULL for any matching rows that were found. For any row from need that didn't have a matching row in aj, those rows will have NULL value for the columns from aj. (That's what the outer join basically does... invents a row of NULL values to serve as a placeholder matching row for the join operation.)

For the conditional test in the WHERE clause, all we need to do is test for NULL values in one of the columns from aj that would be guaranteed to be non-NULL. We could reference the PRIMARY KEY of the table if we knew what that was. We can use the studentid column because any row that satisfies the equality comparison in the join condition is guaranteed to be non-NULL.)

The SQL antijoin can be translated into relational algebra.

An equivalent result can be returned without an inline view.

SELECT student.studentid
     , student.firstname
     , student.lastname
     , enrollment.courseid 
  FROM prereq
  LEFT
  JOIN enrollment
    ON prereq.courseid = enrollment.courseid
  LEFT
  JOIN student
    ON enrollment.studentid = student.studentid
  LEFT
  JOIN enrollment aj
    ON aj.studentid = student.studentid
   AND NOT ( aj.grade LIKE '%F' OR aj.grade IS NULL OR aj.grade LIKE 'N')
 WHERE aj.studentid IS NULL

Upvotes: 1

Related Questions