Reputation: 15181
I have a Company
model that has many Disclosures
. The Disclosure
has columns named title
, pdf
and pdf_sha256
.
class Company < ActiveRecord::Base
has_many :disclosures
end
class Disclosure < ActiveRecord::Base
belongs_to :company
end
I want to make it unique by pdf_sha256
and if pdf_sha256
is nil
that should be treated as unique.
If it is an Array
, I'll write like this.
companies_with_sha256 = company.disclosures.where.not(pdf_sha256: nil).group_by(&:pdf_sha256).map do |key,values|
values.max_by{|v| v.title.length}
end
companies_without_sha256 = company.disclosures.where(pdf_sha256: nil)
companies = companies_with_sha256 + companeis_without_sha256
How can I get the same result by using ActiveRecord query?
Upvotes: 5
Views: 4948
Reputation: 71
If you need a relation with distinct pdf_sha256
, where you require no explicit conditions, you can use group for that -
scope :unique_pdf_sha256, -> { where.not(pdf_sha256: nil).group(:pdf_sha256) }
scope :nil_pdf_sha256, -> { where(pdf_sha256: nil) }
You could have used or
, but the relation passed to it must be structurally compatible. So even if you get same type of relations in these two scopes, you cannot use it with or
.
Edit: To make it structurally compatible with each other you can see @AlexSantos 's answer
Upvotes: 0
Reputation: 66
Assuming you are using Rails 5 you could chain a .or command to merge both your queries.
pdf_sha256_unique_disclosures = company.disclosures.where(pdf_sha256: nil).or(company.disclosures.where.not(pdf_sha256: nil))
Then you can proceed with your group_by logic.
However, in the example above i'm not exactly sure what is the objective but I am curious to better understand how you would use the resulting companies
variable.
If you wanted to have a hash of unique pdf_sha256 keys including nil, and its resultant unique disclosure document you could try the following:
sorted_disclosures = company.disclosures.group_by(&:pdf_sha256).each_with_object({}) do |entries, hash|
hash[entries[0]] = entries[1].max_by{|v| v.title.length}
end
This should give you a resultant hash like structure similar to the group_by where your keys are all your unique pdf_sha256 and the value would be the longest named disclosure that match that pdf_sha256.
Upvotes: 1
Reputation: 2950
It is possible to do it in one query by first getting a different id
for each different pdf_sha256
as a subquery, then in the query getting the elements within that set of ids by passing the subquery as follows:
def unique_disclosures_by_pdf_sha256(company)
subquery = company.disclosures.select('MIN(id) as id').group(:pdf_sha256)
company.disclosures.where(id: subquery)
.or(company.disclosures.where(pdf_sha256: nil))
end
The great thing about this is that ActiveRecord is lazy loaded, so the first subquery
will not be run and will be merged to the second main query to create a single query in the database. It will then retrieve all the disclosures
unique by pdf_sha256
plus all the ones that have pdf_sha256
set to nil
.
In case you are curious, given a company, the resulting query will be something like:
SELECT "disclosures".* FROM "disclosures"
WHERE (
"disclosures"."company_id" = $1 AND "disclosures"."id" IN (
SELECT MAX(id) as id FROM "disclosures" WHERE "disclosures"."company_id" = $2 GROUP BY "disclosures"."pdf_sha256"
)
OR "disclosures"."company_id" = $3 AND "disclosures"."pdf_sha256" IS NULL
)
The great thing about this solution is that the returned value is an ActiveRecord query, so it won't be loaded until you actually need. You can also use it to keep chaining queries. Example, you can select only the id
instead of the whole model and limit the number of results returned by the database:
unique_disclosures_by_pdf_sha256(company).select(:id).limit(10).each { |d| puts d }
Upvotes: 2
Reputation: 211
You can achieve this by using uniq method
Company.first.disclosures.to_a.uniq(&:pdf_sha256)
This will return you the disclosures records uniq by cloumn "pdf_sha256"
Hope this helps you! Cheers
Upvotes: 2
Reputation: 6952
Why not:
ids = Disclosure.select(:id, :pdf_sha256).distinct.map(&:id)
Disclosure.find(ids)
The id sill be distinct either way since it's the primary key, so all you have to do is map the ids and find the Disclosures by id.
Upvotes: 0
Reputation:
Model.select(:rating)
Result of this is an array of Model objects. Not plain ratings. And from uniq's point of view, they are completely different. You can use this:
Model.select(:rating).map(&:rating).uniq
or this (most efficient)
Model.uniq.pluck(:rating)
Model.distinct.pluck(:rating)
Update
Apparently, as of rails 5.0.0.1, it works only on "top level" queries, like above. Doesn't work on collection proxies ("has_many" relations, for example).
Address.distinct.pluck(:city) # => ['Moscow']
user.addresses.distinct.pluck(:city) # => ['Moscow', 'Moscow', 'Moscow']
In this case, deduplicate after the query
user.addresses.pluck(:city).uniq # => ['Moscow']
Upvotes: -1