enrm
enrm

Reputation: 705

SQLAlchemy get the last rows since certain value

Im looking for a way to get all rows since last a certain value was added to the database. For example:

+--------+--------+--------+ 
| created| status |   my_id|
+--------+--------+--------+
| ...    | success|       3|
| ...    | fail   |       3|
| ....   | success|       3|
| ....   | fail   |       3|
| ....   | fail   |       3|
+--------|--------+--------+

Given the following table, my query would extract the last two rows, e.g. "give me the last rows since I last saw a success status"

Corner cases: If there's no success case, then all rows should be returned. If the last row is a success, then return nothing or None or something.

I don't know where to start with this.. I cant order by date since that would mess up the ordering. Doing this in one query would be the most elegant, something like

#Get the last success ?
subq = (db.session.query(MyObject.status, MyObject.created, MyObject.my_id).group_by(my_id).filter_by(status=="success").order_by(created.desc)
 .subquery())
(db.session.query(MyObject).join(
    subq, #Last success
    and_(MyObject.created >= sub.c.created)
    ).filter_by(status=="fail").all())

EDIT: created is of the following form: created = Column(types.DateTime, nullable=False, default=datetime.utcnow)

Here's an excerpt from the database table showing some values:

id  created             status  my_id
1   2020-11-26 16:09:03 failed  3
2   2020-11-26 16:12:03 failed  3
3   2020-11-26 16:15:03 failed  3
4   2020-11-26 16:18:04 failed  3
5   2020-11-26 16:21:04 failed  3
6   2020-11-26 16:24:04 failed  3
7   2020-11-26 16:27:04 failed  3
8   2020-11-26 16:30:05 failed  3
9   2020-11-26 16:33:00 failed  3
10  2020-11-26 16:36:00 failed  3
11  2020-11-26 16:39:01 failed  3
12  2020-11-26 16:42:01 failed  3
13  2020-11-26 16:45:01 failed  3
14  2020-11-26 16:48:01 failed  3
15  2020-11-26 16:51:02 failed  3
16  2020-11-26 16:54:02 failed  3
17  2020-11-26 16:57:02 failed  3
18  2020-11-26 17:00:08 failed  3

EDIT2:

I've managed a query for when we are certain there's a success in there somewhere:

subq = db.session.query(MyObj).filter_by(status="success", my_id=3). order_by (MyObj.id.desc()).limit(1).subquery("t2"))
q = MyObj.query.join(subq, and_(MyObj.status=="failed", MyObj.id > subq.c.id)).all()

This does not solve the issue of "all are fails" however..

Upvotes: 0

Views: 395

Answers (1)

snakecharmerb
snakecharmerb

Reputation: 55610

I think something like this should work

subq = (session.query(sa.func.coalesce(sa.func.max(MyObject.created), dt.datetime.min))
               .filter_by(status='success').scalar())

q = (session.query(MyObject)
            .filter(MyObject.created >= subq)
            .filter(MyObject.status == 'failed'))

In the subquery, we return the greatest "success" date using MAX(created). If there isn't a "success" row then COALESCE supplies datetime.datetime.min instead. We return a scalar so that we can compare the value in a filter.

Upvotes: 1

Related Questions