Reputation: 11882
I have a simple query need: Find a list of users who made an order since Jan 1, 2013.
In SQL, it's a very simple query.
But I'm using Rails and Active Record.
So I wrote: User.joins(:orders).where("orders.created_at >= '2013-01-01 00:00:00'")
In our database, we have 100 orders made since 01/01/2013 by 75 users. (Some users made more than one order apparently.)
However, the expression above returns 100 users. (There must be duplicates.)
I tried User.joins(:orders).where("orders.created_at >= '2013-01-01 00:00:00'").uniq
That doesn't work either.
How can I get the 75 users who've made an order since 01/01/2013?
Upvotes: 53
Views: 37550
Reputation: 1
Thanks for posting this question- it helped me in a similar situation! Adding some context for why the extra rows are showing up in case someone else comes across this in the future and is curious:
The root behavior that's happening here is that the JOIN creates a pseudo-table with a row for each distinct user
-order
combination and then performs filtering for the where
query.
In other words, if User 1 has placed 0 orders, User 1 will not be present in the pseudo-table. If User 2 has placed 1 order, but it was created prior to 2023-01-01
, there will be one row in the pseudo-table, but it will be filtered out of the results returned by the query because it doesn't satisfy the where
clause. If User 3 has placed 1 order since 01-01-2013, then there will be a single row in the pseudo table and that single row will be returned in the results.
(The case happening in the original post that is causing extra rows) If User 4 has placed 3 orders and all were created after 01-01-2013, then the pseudo-table will contain 3 rows for the User (one for User 4-Order 1, one for User 4-Order 2, and one for User 4-Order 3); all of those rows will be returned, meaning you will get 3 rows for User 4, because each result row will also contain joined Order data.
If your main objective is to filter out Users who haven't placed an order since 01-01-2013, then you don't care about the individual orders, so you can roll up the results and just select one per User. That's where the use of .distinct
is coming in, since this is more performant in SQL, rather than taking all User+Order rows and turning them into an array of objects, then filtering (via uniq
, group_by
, or something similar).
Upvotes: 0
Reputation: 861
You can write nested query like this:
User.where(id: User.joins(:orders).where("orders.created_at >= '2013-01-01 00:00:00'").ids)
Upvotes: 2
Reputation: 11882
Rails has added uniq
since version 3.2.1
so now you can use uniq
http://apidock.com/rails/ActiveRecord/QueryMethods/uniq
Upvotes: -4
Reputation: 11706
User.joins(:orders).
where("orders.created_at >= '2013-01-01 00:00:00'").
group('users.id')
group
method will chain to the query and give you a list of unique records.
Upvotes: 20
Reputation: 986
@dbjohn has the right idea, but I assume you want to avoid creating extra objects. Here's a slight variant on his solution, letting the database do the uniq-ing for you:
date = "2013-01-01 00:00:00"
User.joins(:orders).where("orders.created_at >= ?", date).distinct
Note that you can rearrange the order of methods to fit whatever you think is most semantic, and ActiveRecord will write the same SQL for you.
Upvotes: 83