PiBer2
PiBer2

Reputation: 169

SQLAlchemy distinct() filter on one column

I´m trying to get distinct rows of a certain column (and some other columns as well) filtering by a value in that same column. As the other columns in the query have unique values, I´m getting all the rows.

This is the subquery and query. I´m using this method because the join() -which is done on c_id column) is much more efficient than including it in the main query. My table has more than 2M rows and growing.

search_string = <some_str_value>
per_page = <some_int_value>

eventsQ1d = db.session.query(Events.p_id, Events.timestamp_event, Events.c_id).distinct(Events.p_id)\
                                    .filter(Events.p_id.contains(search_string))\
                                    .order_by(Events.timestamp_event.desc())\
                                    .limit(per_page).subquery()

eventsQd = db.session.query(eventsQ1d, Devices.name).join(Devices).all()

Given this Events table:

p_id   timestamp_event  c_id
------ ---------------  -----
ABC    2022-01-15 12:34 001
ABA    2022-02-03 11:21 002
ABC    2022-02-01 10:01 002
XYZ    2022-01-22 09:02 001

when I search by "AB" I would like to get 2 rows (ABC and ABA, plus the other columns in Events and Devices tables) but instead I´m getting the 3 rows that contain the search string.

Basically I don´t know how to make the distinct() clause work only on p_id column.

Upvotes: 0

Views: 2159

Answers (1)

PiBer2
PiBer2

Reputation: 169

This gets the job done, and with great performance (less than 2 seconds in the +2M rows table):

eventsQ1d = db.session.query(Events.p_id, Events.timestamp_event, Events.c_id)\
                                    .filter(Events.p_id.contains(search_string))\
                                    .order_by(Events.timestamp_event.desc())\
                                    .limit(per_page).subquery()

eventsQd = db.session.query(eventsQ1d, Devices.name).group_by(eventsQ1.c.p_id).join(Devices).all()

The grouping is done at the join operation. No distinct() required.

Upvotes: 0

Related Questions