IanVS
IanVS

Reputation: 3775

Avoid Circular Dependency Between Three Tables

I'm designing a database to house scientific test data, using sqlalchemy. I've hit a problem that I can't seem to figure out.

In my test data, each Observation has a State (position, velocity, acceleration), and a State has an associated Time (time at which the state applies). So far, so good. I made a separate table for Times because I deal with different kinds of times, and I wanted to use a reference table to indicate what kind of time each time is (state time, observation time, etc). And the types of times I deal with might change, so normalizing in this way I think will let me add new kinds of times in the future, since they're just rows in a reference table.

So far this part works (using declarative style):

class Observation(Base):
    __tablename__ = 'tbl_observations'
    id = Column(Integer, primary_key=True)
    state_id = Column(Integer, ForeignKey('tbl_states.id'))
    state = relationship('State', uselist=False)

class State(Base):
    __tablename__ = 'tbl_states'
    id = Column(Integer, primary_key=True)
    time_id = Column(Integer, ForeignKey('tbl_times.id'))
    time = relationship('Time', uselist=False)

class Time(Base):
    __tablename__ = 'tbl_times'
    id = Column(Integer, primary_key=True)
    time_type_id = Column(Integer, ForeignKey('ref_tbl_time_types.id'))
    time_type = relationship('TimeType', uselist=False)
    time_value = Column(Float)

class TimeType(Base):
    __tablename__ = 'ref_tbl_time_types'
    id = Column(Integer, primary_key=True)
    desc = Column(String)

The wrinkle is that observations themselves can have different kinds of times. When I try to create a one-to-many relationship between Observation and Time, I get a circular dependency error:

class Observation(Base):
    __tablename__ = 'tbl_observations'
    id = Column(Integer, primary_key=True)
    state_id = Column(Integer, ForeignKey('tbl_states.id'))
    state = relationship('State', uselist=False)

    # Added this line:
    times = relationship('Time')

class Time(Base):
    __tablename__ = 'tbl_times'
    id = Column(Integer, primary_key=True)
    time_type_id = Column(Integer, ForeignKey('ref_tbl_time_types.id'))
    time_type = relationship('TimeType', uselist=False)
    time_value = Column(Float)

    # Added this line:
    observation_id = Column(Integer, ForeignKey('tbl_observations.id'))

I'm guessing this breaks because the original Observation -> State -> Time chain has a reference right back up to Observation.

Is there any way to fix this? Have I gotten my design all screwed up? Am I doing something wrong in sqlalchemy? I'm new to all of this so it could be any of the above. Any help you can give would be very much appreciated.

P.S. I tried doing what was recommended here: Trying to avoid a circular reference but either I did it wrong or it didn't solve my particular problem.

Upvotes: 1

Views: 1659

Answers (3)

zzzeek
zzzeek

Reputation: 75137

The other answers here regarding reconsideration of your use case are valuable, and you should consider those. However, as far as SQLAlchemy is concerned, the circular dependency issue due to multiple FKs is solved by the use_alter/post_update combination, documented at http://docs.sqlalchemy.org/en/rel_0_7/orm/relationships.html#rows-that-point-to-themselves-mutually-dependent-rows . Here is the model using that:

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base= declarative_base()

class Observation(Base):
    __tablename__ = 'tbl_observations'
    id = Column(Integer, primary_key=True)
    state_id = Column(Integer, ForeignKey('tbl_states.id'))
    state = relationship('State', uselist=False)

    times = relationship('Time')

class State(Base):
    __tablename__ = 'tbl_states'
    id = Column(Integer, primary_key=True)
    time_id = Column(Integer, ForeignKey('tbl_times.id'))

    # post_update is preferable on the many-to-one
    # only to reduce the number of UPDATE statements
    # versus it being on a one-to-many.
    # It can be on Observation.times just as easily.
    time = relationship('Time', post_update=True)

class Time(Base):
    __tablename__ = 'tbl_times'
    id = Column(Integer, primary_key=True)
    time_type_id = Column(Integer, ForeignKey('ref_tbl_time_types.id'))
    time_type = relationship('TimeType', uselist=False)
    time_value = Column(Float)

    observation_id = Column(Integer, ForeignKey('tbl_observations.id', 
                                    use_alter=True, name="fk_time_obs_id"))

class TimeType(Base):
    __tablename__ = 'ref_tbl_time_types'
    id = Column(Integer, primary_key=True)
    desc = Column(String)


e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
Base.metadata.drop_all(e)
Base.metadata.create_all(e)

s = Session(e)

tt1 = TimeType(desc="some time type")
t1, t2, t3, t4, t5 = Time(time_type=tt1, time_value=40), \
                Time(time_type=tt1, time_value=50), \
                Time(time_type=tt1, time_value=60),\
                Time(time_type=tt1, time_value=70),\
                Time(time_type=tt1, time_value=80)

s.add_all([
    Observation(state=State(time=t1), times=[t1, t2]),
    Observation(state=State(time=t2), times=[t1, t3, t4]),
    Observation(state=State(time=t2), times=[t2, t3, t4, t5]),
])

s.commit()

Upvotes: 4

van
van

Reputation: 76992

I guess I do not completely get the model names in your object model and how they correspond to the real world. But I will try to guess. First, I doubt that the model Time (which looks to be rather basic and almost logic-free) should have a ForeignKey to some higher-level model class Observation. In light of this, I see your model not as a chain of n-1 relationships, but rather a kind of ternary relationship. So I could see you model like following:

class Base(object):
    id = Column(Integer, primary_key=True)

class Observation(Base):
    __tablename__ = 'tbl_observations'

class ObservationInstance(Base):
    __tablename__ = 'tbl_observation_instances'
    observation_id = Column(Integer, ForeignKey('tbl_observations.id'))
    state_id = Column(Integer, ForeignKey('tbl_states.id'))
    time_id = Column(Integer, ForeignKey('tbl_times.id'))

    # relationships
    observation = relationship('Observation', backref="instances")
    state = relationship('State')
    time = relationship('Time')

class State(Base):
    __tablename__ = 'tbl_states'

class Time(Base):
    __tablename__ = 'tbl_times'
    time_type_id = Column(Integer, ForeignKey('ref_tbl_time_types.id'))
    time_type = relationship('TimeType', uselist=False)
    time_value = Column(Float)

class TimeType(Base):
    __tablename__ = 'ref_tbl_time_types'
    desc = Column(String)

Hope this makes any sense, and fits the real world you are trying to model. I assumed that you model represents some kind of (scientific) experiment. In this case I would rename Observation -> Experiement and ObservationInstance -> Observation.

Upvotes: 0

Caleb
Caleb

Reputation: 1088

You have a many to one relationship between Observations and States. So one State can have many Observations, and every Observation has one State.

You also have a many to one relationship between States and Times. So one Time can have many States, and every State has one Time.

You are correct in that the problem is the reference back to observations from Times. You are forcing each Time to have an Observation, which in turn has to have a State, which in turn has to have a Time (and then the loop repeats forever).

To break this you need to figure out what you are actually trying to portray in these relationships. If an Observation has a State, which has a Time, then the Observation has a Time (you can get the Time from the State).

So the real question you need to answer is: What does it mean to say that a Time has an Observation? how would you be using that in your application?

Upvotes: 2

Related Questions