Reputation: 18418
Is there a way to retrieve a subset of elements from a relationship in SqlAlchemy but keeping the order I defined in the backref
(relationship) when the class was created?
Let's say I have two tables/models: A User
and a UserLog
which contains actions performed by the user (the Base
is just a declarative_base())
Each UserLog
has a ForeignKey to the ID of the User
that performed the action, and the type of action for each log can be chosen from an Enum
. Each UserLog also has a created
timestamp I order with: On a regular scenario, I'd like to get the logs from oldest to newest (oldest first).
However, I'd also like to be able to get a particular user with only a subset of Logs (logs of a particular action) loaded. I understand this goes agains the "gist" of the joined load, since I would be getting an unrealistic state of the database (there are more logs linked to the user that my fetched User
object would reflect), but it would be helpful in certain occasions.
class UserLog(Base):
__tablename__ = 'user_logs'
id = Column(UUID(as_uuid=True), primary_key=True)
timestamp_created = Column(
DateTime, server_default=sqlalchemy.text("now()")
)
user_id = Column(
UUID(as_uuid=True),
ForeignKey("users.id", ondelete="CASCADE")
)
action = Column(SaEnum(UserLogsConstants), nullable=False)
user = relationship(
"User",
backref=backref(
'logs', order_by="asc(UserLog.timestamp_created)"
) # The order_by is important here
)
Thanks to this SO answer I've been able to load a subset of logs using contains_eager
, but then I lose the ordering by timestamp_created
that I stated in the backref
.
Let's say I have two available actions for the logs: UPLOAD_START
, UPLOAD_END
(for the action
column)
I've set up a test in which I create a User
with 5 UPLOAD_START
actions and 5 UPLOAD_END
actions created out of order (I'm artificially inserting "newer" actions first).
So what I'd like is getting a User
with its .logs
relationship containing only logs whose event is UPLOAD_START
and keep the order by timestamp specified in the backref
object.
Here's the test query
u = session.query(User) \
.filter(User.id == test_user_id) \
.join(User.logs) \
.filter(UserLog.action == UserLogsConstants.UPLOAD_START) \
.options(contains_eager(User.logs)) \
.one()
These two asserts work fine:
assert len(u.logs) == 5
# I inserted 10 logs total, only 5 with action=UPLOAD_START
assert all(ul.action == UserLogsConstants.UPLOAD_START for ul in u.logs)
# Neat, only logs with `UPLOAD_START`
But this one fails: the ordering is not respected.
assert all(
u.logs[i].timestamp_created < u.logs[i + 1].timestamp_created
for i in range(len(u.logs) - 1)
)
That makes sense. If I understand the contains_eager
behavior, this is kind of a forget what you think the relationship is, and use what I'm telling you in the preceding query (and I'm not saying anything about the order in that query). And, indeeed: I'm looking at the SQL generated by SqlAlchemy and there's no ORDER BY
clause. I suppose I could always add it myself to the query, but if would be cleaner if I didn't need to.
For those familiar with Django, I'm kind of trying to emulate the Prefech
object, where I can specify a subquery to get the loaded objects from:
User.objects.prefetch_related(
Prefetch(
'logs',
queryset=UserLogs.objects.filter(action=UserLogsConstants.UPLOAD_START)
)
).get(pk='foo-uuid')
Upvotes: 0
Views: 2100
Reputation: 1122022
When using contains_eager
, you tell SQLAlchemy to ignore the configured relationship and instead fill the .logs
container from your hand-crafted query. That means any ordering set on the relationship is also ignored.
So if you need the log data to be ordered, you need to do so yourself:
u = (
session.query(User)
.filter(User.id == test_user_id)
.join(User.logs)
.filter(UserLog.action == UserLogsConstants.UPLOAD_START)
.order_by(User.id, asc(UserLog.timestamp_created))
.options(contains_eager(User.logs))
.one()
)
If you are more commonly filtering user logs, then another option is to make logs
a dynamic relationship. This replaces the container with a pre-populated Query
object you'd filter on:
user = relationship(
"User",
backref=backref(
'logs', lazy='dynamic', order_by="asc(UserLog.timestamp_created)"
)
)
and use
u = session.query(User).filter(User.id == test_user_id).one()
upload_start_logs = u.logs.filter(UserLog.action == UserLogsConstants.UPLOAD_START).all()
Of course, this does issue a separate query.
Upvotes: 5