Reputation: 4997
I've got a model, Prospect
. I want to sort prospects by some value (e.g. updated_at ASC
), however I also want to bisect the sorted list so that all prospects where lead_id IS NOT NULL
are shown first, sorted by updated_at ASC
, then all prospects where lead_id IS NULL
are shown last sorted by updated_at ASC
. I don't care what lead_id
is, only that any records where lead_id IS NOT NULL
are bumped to the beginning of the list. I need to be able to paginate the dataset and the query needs to be relatively efficient (turnaround time should be well under 500ms, ideally under 100ms).
First Approach
I first tried to accomplish this using the following query, however this doesn't work (for my use case) because prospects are sorted by lead_id (as you would expect from this query), which is unique, therefore making a secondary sort effectively useless.
Prospect.order("lead_id ASC nulls last, updated_at ASC")
Second Approach
I tried a (slightly modified) approach suggested by Andrey Deineko. This returns the entire dataset in the correct order, but there is no way to merge the two separate relations into a single relation that can be paginated. As a result, in order to paginate the dataset, I need to instantiate every row from the table into memory. That would be acceptable for a few dozen records at most, but certainly not 20k+.
# prospect.rb
scope :with_leads, -> { where.not(lead_id: nil) }
scope :without_leads, -> { where(lead_id: nil) }
scope :ordered, -> { order(:total_clicks_count) }
[Prospect.with_leads.ordered, Prospect.without_leads.ordered].flatten
Third Approach
I realized that I can get a sorted list of prospect ids (both with and without a lead_id) and use that to get the full dataset ordered by id. This accomplishes what I need and works fine for a few dozen or hundred records, but isn't viable for 20k+ records.
lead_ids = Prospect.where.not(lead_id: nil).pluck(:id)
prospect_ids = Prospect.where(lead_id: nil).pluck(:id)
prospects = Prospect.order_by_ids([lead_ids, prospect_ids].flatten)
Here is the source for order_by_ids
:
class ApplicationRecord < ActiveRecord::Base
self.abstract_class = true
def self.order_by_ids(ids)
# https://stackoverflow.com/a/26777669/4139179
order_by = ["CASE"]
ids.each_with_index do |id, index|
order_by << "WHEN #{self.name.underscore}s.id='#{id}' THEN #{index}"
end
order_by << "END"
order(order_by.join(" "))
end
end
The problem with this second approach is that it takes up to 1 second per 1000 records to build the list or ordered ids for each subset (prospects with and without a lead_id) and then use that to fetch the entire dataset in order.
Is there a better approach that will return the entire dataset (in a way that can be paginated) ordered by some attribute (e.g. updated_at ASC
) where prospects with a lead_id are at the top of the list and those without are at the bottom?
Upvotes: 1
Views: 2295
Reputation: 52357
When using ORDER BY
"column" in PostgreSQL, NULL
values will come last by default. So
Prospect.order(:lead_id, :updated_at)
should do the trick.
Your actual need:
The effect in practice would be that when a user views a list of their sales prospects, those that have been converted to leads will show up first in the list, even if the list is sorted by other attributes.
# prospect.rb
scope :with_leads, -> { where.not(lead_id: nil) }
scope :without_leads, -> { where(lead_id: nil) }
scope :ordered, -> { order(:total_clicks_count) }
And then use these scopes to present to the user:
Prospect.with_leads.ordered.each do
#...
end
Prospect.without_leads.ordered.each do
#...
end
Upvotes: 1