user2682863
user2682863

Reputation: 3218

How do you test for unmatched tables when using outer join (Many to Many)

In sqlalchemy, how to you test for unmatched tables when using an outerjoin on a many-to-many relationship.

I'm working with 5 tables/objects.

In the below example I'm trying to find student absences. I perform an outerjoin on the Attendance table because the Attendance object doesn't always exist. I want to return rows where Attendance.absent == True or rows where Attendance is not matched (ie not yet created).

class Session(BaseModel):
    id = db.Column(db.Integer, primary_key=True)
    # list of Attendance objects for students that attended this session
    attendance = db.relationship(Attendance, backref="session") 

    @classmethod
    def get_absences(cls, student):
        # I'm trying to return a list of sessions where this student was absent
        # The query joins the Program table so it can join
        # the enrollment table, so it can filter for enrollments
        # for this student and then return sessions where 
        # the Attendance object has not been created or Attendance.absent == True
        return (
            cls.query
                .join(Program)
                .join(Enrollment)
                .filter(Enrollment.student_id == student.id)
                .outerjoin(Attendance)
                 # Attendance == None is not correct
                .filter(or_(Attendance == None, Attendance.absent == True))
        )


class Attendance(BaseModel):
    id = db.Column(db.Integer, primary_key=True)
    session_id = db.Column(db.Integer, db.ForeignKey('session.id') )
    student_id = db.Column(db.Integer, db.ForeignKey('student.id'))

So Session.get_absences() obviously doesn't work because Attendance == None isn't overloaded and doesn't generate sql. How do I return sessions where the Attendance object hasn't been created for a given Student / Session pair?

Upvotes: 3

Views: 201

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 658032

Assuming a standard relational model with referential integrity enforced by FK constraints.

Basically, you need a query like this:

SELECT st.id, se.id, a.absent
FROM   student         st
JOIN   enrollmant      e  ON e.student_id = st.id
-- JOIN program        p  ON p.id = e.program_id  -- omit middleman
JOIN   session         se ON se.program_id = e.program_id -- p.id -- omit middleman
LEFT   JOIN attendance a  ON a.student_id = st.id
                         AND a.session_id = se.id
WHERE  st.id = 123                                -- given student ID
AND   (a.student_id IS NULL OR a.absent);         -- missing or "absent"

Returns a list of student ID and session ID where the student was either absent or no entry was found.

The full list of candidates is generated from joining to enrollment, (program), and session. The final LEFT [OUTER] JOIN to attendance determines the missing ones.

I commented out the join to program, as this is just a middleman we don't need for the query. (Unless you need to assert a matching row in that table.)

I added absent to the result to tell the difference between missing and "absent" entries.

See:

Upvotes: 4

Yaakov Bressler
Yaakov Bressler

Reputation: 12098

Modify your query to allow a full outer with Attendance -- also, Attendance.absent with no matches will return null so match your query for those results :

cls.query\
    .join(Program)\
    .join(Enrollment)\
    .filter(Enrollment.student_id == student.id)\
    .join(Attendance, isouter =True, full=True)\ 
    .filter(or_(Attendance.absent == None, Attendance.absent == True))\
    .all()

Upvotes: 2

Related Questions