SQLAlchemy insert, update precondition (consistency check)

Question

I would like to keep an invariant when inserting into or updating an 'interval' many-to-many relation using SQLAlchemy. The goal is to make sure that there are no overlapping intervals in the table that represents the intervals for the same object (A).

For example:

class A(Base):
    __tablename__ = 'a'
    id = Column(Integer, primary_key=True)
    intervals = relationship('Interval', backref='a', cascade='all, delete-orphan')


class B(Base):
    __tablename__ = 'b'
    id = Column(Integer, primary_key=True)
    intervals = relationship('Interval', backref='b', cascade='all, delete-orphan')

class Interval(Base):
    __tablename__ = 'interval'
    id = Column(Integer, primary_key=True)
    a_id = Column(Integer, ForeignKey('a.id', ondelete='cascade'), nullable=False)
    b_id = Column(Integer, ForeignKey('b.id', ondelete='cascade'), nullable=False)
    start = Column(DateTime, nullable=False)
    end = Column(DateTime, nullable=False)

Updated based on van's notes (thanks):

So I want to ensure before/during insert and update that there is no record in the 'interval' table where

((interval.start >= start and interval.start <= end) or
(interval.end >= start and interval.end <= end) or
(start >= interval.start and start <= interval.end) or
(end >= interval.start and end <= interval.end)) and
a_id = interval.a_id

So the question would be what the best way is to achieve this with SQLAlchemy using a MySQL backend. I want to make sure this check is as atomic as possible and there is no possibility to violate it with concurrent operations. I hope the question is clear. Thanks for the help in advance.

Update 1:

Answers to van's questions:

What is your workflow in case of validation failure?

The user should receive an error message that the interval that the user tried to add overlaps an already existing interval.
The user might also receive the list of intervals that are colliding with the one the user tried to add.

What will you do with those Intervals which failed validation and were not added to a?

See the answer to the workflow question.
So basically, if validation fails when someone tries to add an overlapping interval an exception should occur and the user should receive an error message that this is not allowed.
The intervals that were not added should not be committed to the database.

note that they will still be committed to DB because they belong to b1 but have a_id = NULL (you do allow NULLs in your current model)

These shouldn't be committed and there shouldn't be NULL values for either a_id or b_id. I updated the model to reflect this.
That is it could be NULL but that does not seem to make sense to me as the records with a_id = NULL should have been cleaned up somehow then.

What should the end user see? Shall the transaction be committed at all etc? Are you happy to wrap any a.intervals.add(...) in a try/except?

The user should see an error when this constraint is violated.
The transaction should by no means be commited.
Try except wrapping is fine.

This is how I imagine adding an interval:

1.) Form validation happens and we know that other fields already failed validation (interval will not be added to the session) and we want to check if the interval also fails validation:

### Start request
# Prevalidation
Interval.check_valid(a, start, end)

...
#### End request

2.) All form fields passed validation, we want to check if interval would be valid if committed (this might not be needed) and then commit the interval:

# Start request
# Basic validation at the time of addition
try:
    interval = Interval(a, b, start, end)
except SomeException:
    return("This interval overlaps another, please fix this!")

...

# Main validation when the interval relation is committed to database
try:
    session.flush() # commit the session to the database
except AnotherException: # maybe IntegrityError or something similar
    return("This interval overlaps another, please fix this!")
### End request

I'm not sure if the Basic validation is really needed though. The Prevalidation is needed because if someone POSTs the form and we see some errors for other fields we would like to get all invalid fields listed in the response. The reason for this is that I don't want the user to POST the form several times with coming back to him again and again with newer errors, but I would like to list all the errors at once that I'm able to check.

Looking at the solution van provided it does do the Basic validation, but there might be some edge cases where this could cause problems. This is a web application so it is possible that two different users are using different sessions, so when doing i1 = Interval(a1, b1, start1, end1), i1 does not appear in other sessions until the registry is refreshed from database in that session. (At least this is what I think how it works.)

Users tries to add an interval (i1) that that does not overlap any intervals in the database. Interval(a1, b1, start1, end1)
Users tries to add an interval (i2) that that does not overlap any intervals in the database, but overlaps with i1. Interval(a1, b2, start2, end2)
Both validations might succeed as the two users might be using different sessions.
Session i1 is belonging to is flush()-ed and committed to database without errors.
Session i2 is belonging to is flush()-ed and committed to database without errors.

Is the situation listed here possible or am I misunderstanding something?

I was also thinking about adding UPDATE and INSERT triggers to the 'interval' table to do this check. I'm not sure if this is the right way to go and if that provides the atomicity that will ensure that concurrent attempts like listed above will not cause a problem. If this is the right way to go my question would be what the right way is for creating those triggers when calling Base.metedata.create_all(). Is this the only way for doing this or is there a possibility to somehow attach this to the model and let create_all() create it: DELIMITER / Creating a trigger in SQLAlchemy

Update 2:

The right algorithm for checking if the intervals collide seems to be:

interval.start <= end and start <= interval.end

I found that the right approach for ensuring the appropriate atomicity is to just use select for update in the method where the overlap is checked. This works good with MySQL at least.

SQLAlchemy insert, update precondition (consistency check)

Answers (1)

Related Questions