Reputation: 913
TLDR: Is there a way to define a scope such that I can manipulate the records found by a query that uses that scope before it is returned? Can I use the data returned by the query to prepopulate arbitrary values on the collection of records, much like rails can "preload" association data?
Basically I've got a database table holding hierarchical information, so each row has a parent, and there are lots of times I have to chain my way up and down the hierarchy to get a nodes parents or children. To improve performance we're making heavy use of Postgresql's WITH RECURSIVE query which let's us quickly grab all decedents of a given set of node ids. On my actual model, I have two key methods which use this kind of query: an instance method descendants
and a scope find_with_all_descendants(*ids)
. However, if I have a collection of these models and I want to loop through and get the descendants for each by calling descendants
I end up generating a query for each record. So my current code looks like this
collection = Node.find_with_all_descendants(1,2,3,4)
# collection gets passed around to other parts of the program ...
collection.each do |node|
# other parts of the program do stuff with node.descendants, resulting in
# a select N+1 issue as the query for descendants fires
node.descendants
end
What would be great would be if I could call Node.find_with_all_descendants(*ids)
and then prepoluate the descendants collection, so subsequent calls to descendants
for any of the returned records hits the cached data, rather than resulting in another query. So my Node.descendants
method might look like this.
def descendants
return @cached_descendants if @cached_descendants
# otherwise execute big sql statement I'm not including
end
Then I just need to find a place where I can set @cached_descendants
for records being returned by queries that are using find_with_all_descendants
But given this is a scope, and all I can return is an active record association, I'm not clear how I could go about setting this cached value. Is there any sort of hook where I can run code after any queries that use my find_with_all_descendants
scope return their records?
UPDATE: Including the relevant methods by request. Also including some monkey patching magic we use to load the depth and path of the nodes for completeness sake.
scope :find_with_all_descendants, -> (*ids) do
tree_sql = <<-SQL
WITH RECURSIVE search_tree(id, path, depth) AS (
SELECT id, ARRAY[id], 1
FROM #{table_name}
WHERE #{table_name}.id IN(#{ids.join(', ')})
UNION ALL
SELECT #{table_name}.id, path || #{table_name}.id, depth + 1
FROM search_tree
JOIN #{table_name} ON #{table_name}.parent_id = search_tree.id
WHERE NOT #{table_name}.id = ANY(path)
)
SELECT id, depth, path FROM search_tree ORDER BY path
SQL
if ids.any?
rel = select("*")
.joins("JOIN (#{tree_sql}) tree ON tree.id = #{table_name}.id")
.send(:extend, NodeRelationMethods)
else
Node.none
end
end
def descendants
self.class.find_with_all_descendants(self.id).where.not(id: self.id)
end
# This defines the methods we're going to monkey patch into the relation returned by
# find_with_all_descendants so that we can get the path and the depth of nodes
module NodeRelationMethods
# All nodes found by original ids will have a depth of 1
# depth is accessible by calling node.depth
def with_depth
# Because rails is a magical fairy unicorn, just adding this select statement
# automatically adds the depth attribute to the data nodes returned by this
# scope
select("tree.depth as depth")
end
def with_path
# Because rails is a magical fairy unicorn, just adding this select statement
# automatically adds the path attribute to the data nodes returned by this
# scope
self.select("tree.path as path")
end
end
Upvotes: 3
Views: 965
Reputation: 21
This is very far past the point you needed it, but I'm faced with a very similar problem, and I'm wondering if the recursive query gem was looked at, or if it was available at the time, and if it would suit your needs in this case? I'm hoping not to monkey-patch a core class, nor ideally to override a method in ActiveRecord, but this seems to be a solid DSL-style extension to address what I should think is a decently common issue:
https://github.com/take-five/activerecord-hierarchical_query
Upvotes: 0
Reputation: 913
It looks like this can be done by overriding http://apidock.com/rails/v3.2.3/ActiveRecord/Relation/exec_queries. Here's some sample code boiled down to the bare esseentials
scope :find_with_all_descendants, -> (*ids) do
#load all your records here...
where(#...).extend(IncludeDescendants)
end
module IncludeDescendants
def exec_queries
records = super
records.each do |r|
#pre-populate/manipulate records here before returning
end
end
end
Basically rails calls Relation#exec_queries right before the records are returned. By extending the relation we return in our scope, we can override exec_queries. In the overriden method, we get the original method results, manipulate them further and then return
Upvotes: 1
Reputation: 3842
If you add a path[1]
to the select, you should be able to use Ruby's group_by
(not AR group
, which is for SQL GROUP BY
) to group the selected records by the top-level parent ID. I've written an example of this below, with some refactoring of the scope to take advantage of chained scopes:
def self.all_descendants
tree_sql = <<-SQL
WITH RECURSIVE search_tree(id, path, depth) AS (
SELECT id, ARRAY[id], 1
FROM (#{where("1=1").to_sql}) tmp
UNION ALL
SELECT #{table_name}.id, path || #{table_name}.id, depth + 1
FROM search_tree
JOIN #{table_name} ON #{table_name}.parent_id = search_tree.id
WHERE NOT (#{table_name}.id = ANY(path))
)
SELECT id, depth, path FROM search_tree ORDER BY path
SQL
unscoped.select("*, tree.depth as depth, tree.path as path, tree.path[1] AS top_parent_id")
.joins("JOIN (#{tree_sql}) tree ON tree.id = #{table_name}.id")
end
def descendants
self.class.where(id: id).all_descendants.where.not(id: id)
end
This way, you can do the following:
collection = Node.where(id: [1,2,3,4]).all_descendants
collection.group_by(&:top_parent_id).each do |top_parent_id, descendant_group|
top_parent = descendant_group.detect{|n| n.id == top_parent_id}
top_parent_descendants = descendant_group - top_parent
# do stuff with top_parent_descendants
end
Upvotes: 1