Reputation: 169
I´m trying to get distinct rows of a certain column (and some other columns as well) filtering by a value in that same column. As the other columns in the query have unique values, I´m getting all the rows.
This is the subquery and query. I´m using this method because the join()
-which is done on c_id
column) is much more efficient than including it in the main query. My table has more than 2M rows and growing.
search_string = <some_str_value>
per_page = <some_int_value>
eventsQ1d = db.session.query(Events.p_id, Events.timestamp_event, Events.c_id).distinct(Events.p_id)\
.filter(Events.p_id.contains(search_string))\
.order_by(Events.timestamp_event.desc())\
.limit(per_page).subquery()
eventsQd = db.session.query(eventsQ1d, Devices.name).join(Devices).all()
Given this Events table:
p_id timestamp_event c_id
------ --------------- -----
ABC 2022-01-15 12:34 001
ABA 2022-02-03 11:21 002
ABC 2022-02-01 10:01 002
XYZ 2022-01-22 09:02 001
when I search by "AB" I would like to get 2 rows (ABC and ABA, plus the other columns in Events and Devices tables) but instead I´m getting the 3 rows that contain the search string.
Basically I don´t know how to make the distinct()
clause work only on p_id
column.
Upvotes: 0
Views: 2159
Reputation: 169
This gets the job done, and with great performance (less than 2 seconds in the +2M rows table):
eventsQ1d = db.session.query(Events.p_id, Events.timestamp_event, Events.c_id)\
.filter(Events.p_id.contains(search_string))\
.order_by(Events.timestamp_event.desc())\
.limit(per_page).subquery()
eventsQd = db.session.query(eventsQ1d, Devices.name).group_by(eventsQ1.c.p_id).join(Devices).all()
The grouping is done at the join operation. No distinct()
required.
Upvotes: 0