Jon Snow
Jon Snow

Reputation: 11882

How to make Active Record join return unique objects?

I have a simple query need: Find a list of users who made an order since Jan 1, 2013.

In SQL, it's a very simple query.

But I'm using Rails and Active Record.

So I wrote: User.joins(:orders).where("orders.created_at >= '2013-01-01 00:00:00'")

In our database, we have 100 orders made since 01/01/2013 by 75 users. (Some users made more than one order apparently.)

However, the expression above returns 100 users. (There must be duplicates.)

I tried User.joins(:orders).where("orders.created_at >= '2013-01-01 00:00:00'").uniq

That doesn't work either.

How can I get the 75 users who've made an order since 01/01/2013?

Upvotes: 53

Views: 37550

Answers (5)

BIGred053
BIGred053

Reputation: 1

Thanks for posting this question- it helped me in a similar situation! Adding some context for why the extra rows are showing up in case someone else comes across this in the future and is curious:

The root behavior that's happening here is that the JOIN creates a pseudo-table with a row for each distinct user-order combination and then performs filtering for the where query.

In other words, if User 1 has placed 0 orders, User 1 will not be present in the pseudo-table. If User 2 has placed 1 order, but it was created prior to 2023-01-01, there will be one row in the pseudo-table, but it will be filtered out of the results returned by the query because it doesn't satisfy the where clause. If User 3 has placed 1 order since 01-01-2013, then there will be a single row in the pseudo table and that single row will be returned in the results.

(The case happening in the original post that is causing extra rows) If User 4 has placed 3 orders and all were created after 01-01-2013, then the pseudo-table will contain 3 rows for the User (one for User 4-Order 1, one for User 4-Order 2, and one for User 4-Order 3); all of those rows will be returned, meaning you will get 3 rows for User 4, because each result row will also contain joined Order data.

If your main objective is to filter out Users who haven't placed an order since 01-01-2013, then you don't care about the individual orders, so you can roll up the results and just select one per User. That's where the use of .distinct is coming in, since this is more performant in SQL, rather than taking all User+Order rows and turning them into an array of objects, then filtering (via uniq, group_by, or something similar).

Upvotes: 0

pa3k
pa3k

Reputation: 861

You can write nested query like this:

User.where(id: User.joins(:orders).where("orders.created_at >= '2013-01-01 00:00:00'").ids)

Upvotes: 2

Jon Snow
Jon Snow

Reputation: 11882

Rails has added uniq since version 3.2.1 so now you can use uniq

http://apidock.com/rails/ActiveRecord/QueryMethods/uniq

Upvotes: -4

konyak
konyak

Reputation: 11706

User.joins(:orders).
    where("orders.created_at >= '2013-01-01 00:00:00'").
    group('users.id')

group method will chain to the query and give you a list of unique records.

Upvotes: 20

amcaplan
amcaplan

Reputation: 986

@dbjohn has the right idea, but I assume you want to avoid creating extra objects. Here's a slight variant on his solution, letting the database do the uniq-ing for you:

date = "2013-01-01 00:00:00"
User.joins(:orders).where("orders.created_at >= ?", date).distinct

Note that you can rearrange the order of methods to fit whatever you think is most semantic, and ActiveRecord will write the same SQL for you.

Upvotes: 83

Related Questions