Derrick Mar
Derrick Mar

Reputation: 1021

Rails - Active Record: Find all records which have a count on has_many association with certain attributes

A user has many identities.

class User < ActiveRecord::Base
    has_many :identities
end

class Identity < ActiveRecord::Base
    belongs_to :user
end

An identity has an a confirmed:boolean column. I'd like to query all users that have an only ONE identity. This identity must also be confirmed false.

I've tried this

User.joins(:identities).group("users.id").having( 'count(user_id) = 1').where(identities: { confirmed: false })

But this returns users with one identity confirmed:false but they could also have additional identities if they are confirmed true. I only want users with only one identity confirmed:false and no additional identities that are have confirmed attribute as true.

I've also tried this but obviously it's slow and I'm looking for the right SQL to just do this in one query.

  def self.new_users
    users = User.joins(:identities).where(identities: { confirmed: false })
    users.select { |user| user.identities.count == 1 }
  end

Apologies upfront if this was already answered but I could not find a similar post.

Upvotes: 4

Views: 3368

Answers (3)

Derrick Mar
Derrick Mar

Reputation: 1021

One solution is to use rails nested queries

User.joins(:identities).where(id: Identity.select(:user_id).unconfirmed).group("users.id").having( 'count(user_id) = 1')

And here's the SQL generated by the query

SELECT "users".* FROM "users"
INNER JOIN "identities" ON "identities"."user_id" = "users"."id"
WHERE "users"."id" IN (SELECT "identities"."user_id" FROM "identities"  WHERE "identities"."confirmed" = 'f')
GROUP BY users.id HAVING count(user_id) = 1

I still don't think this is the most efficient way. While I'm able to generate only one SQL query (meaning only one network call to the db), I'm still have to do two scans: one scan on the USERS table and one scan on the IDENTITIES table. This can be optimized by indexing the identities.confirmed column but this still doesn't solve the two full scans problem.

For those who understand the query plan here it is:

     QUERY PLAN
-------------------------------------------------------------------------------------------
 HashAggregate  (cost=32.96..33.09 rows=10 width=3149)
   Filter: (count(identities.user_id) = 1)
   ->  Hash Semi Join  (cost=21.59..32.91 rows=10 width=3149)
         Hash Cond: (identities.user_id = identities_1.user_id)
         ->  Hash Join  (cost=10.45..21.61 rows=20 width=3149)
               Hash Cond: (identities.user_id = users.id)
               ->  Seq Scan on identities  (cost=0.00..10.70 rows=70 width=4)
               ->  Hash  (cost=10.20..10.20 rows=20 width=3145)
                     ->  Seq Scan on users  (cost=0.00..10.20 rows=20 width=3145)
         ->  Hash  (cost=10.70..10.70 rows=35 width=4)
               ->  Seq Scan on identities identities_1  (cost=0.00..10.70 rows=35 width=4)
                     Filter: (NOT confirmed)
(12 rows)

Upvotes: 1

Max Williams
Max Williams

Reputation: 32933

I think group_concat may be the answer here, if you have the function in your DBMS. (if not there may be an equivalent). This will collect all the values for the field from the group into a comma-separated string. We want ones where this string is equal to "false": ie, there's just one, and it's false (which i think is your requirement, it's a little unclear). . I think this should work if we let Rails handle the translation of false into however the DB stores it.

User.joins(:identities).group("identities.user_id").having("group_concat(identities.confirmed) = ?", false)

EDIT - if your database stores false as 0 then the above will generate sql like having group_concat(identities.confirmed) = 0. Because the result of the group_concat is a string, then it may (in some DBMS's) do a string-to-integer cast on the results before comparing it to 0, which will return lots of false positives if all the other strings cast to 0. In that case you can try this:

User.joins(:identities).group("identities.user_id").having("group_concat(identities.confirmed) = '?'", false)

(note quotes around ?)

EDIT2 - postgres version.

I've not tried this but it looks like recent versions of postgres have a function array_agg() which does the same as mysql's group_concat(). Because postgres stores true/false as 't'/'f' we shouldn't need to wrap the ? in quotes. Try this:

User.joins(:identities).group("identities.user_id").having("array_agg(identities.confirmed) = ?", false)

Upvotes: 0

Nermin
Nermin

Reputation: 6100

  def self.new_users
    joins(:identities).group("identities.user_id").having("count(identities.user_id) = 1").where(identities: {confirmed: false}).uniq
  end

Upvotes: 0

Related Questions