AndrewSwerlick
AndrewSwerlick

Reputation: 913

Manipulating records returned in an active record scope

TLDR: Is there a way to define a scope such that I can manipulate the records found by a query that uses that scope before it is returned? Can I use the data returned by the query to prepopulate arbitrary values on the collection of records, much like rails can "preload" association data?

Basically I've got a database table holding hierarchical information, so each row has a parent, and there are lots of times I have to chain my way up and down the hierarchy to get a nodes parents or children. To improve performance we're making heavy use of Postgresql's WITH RECURSIVE query which let's us quickly grab all decedents of a given set of node ids. On my actual model, I have two key methods which use this kind of query: an instance method descendants and a scope find_with_all_descendants(*ids). However, if I have a collection of these models and I want to loop through and get the descendants for each by calling descendants I end up generating a query for each record. So my current code looks like this

collection = Node.find_with_all_descendants(1,2,3,4)

# collection gets passed around to other parts of the program ...

collection.each do |node|
  # other parts of the program do stuff with node.descendants, resulting in    
  # a select N+1 issue as the query for descendants fires

  node.descendants
end

What would be great would be if I could call Node.find_with_all_descendants(*ids) and then prepoluate the descendants collection, so subsequent calls to descendants for any of the returned records hits the cached data, rather than resulting in another query. So my Node.descendants method might look like this.

def descendants
  return @cached_descendants if @cached_descendants
  # otherwise execute big sql statement I'm not including 
end

Then I just need to find a place where I can set @cached_descendants for records being returned by queries that are using find_with_all_descendants

But given this is a scope, and all I can return is an active record association, I'm not clear how I could go about setting this cached value. Is there any sort of hook where I can run code after any queries that use my find_with_all_descendants scope return their records?

UPDATE: Including the relevant methods by request. Also including some monkey patching magic we use to load the depth and path of the nodes for completeness sake.

scope :find_with_all_descendants, -> (*ids) do
  tree_sql =  <<-SQL
    WITH RECURSIVE search_tree(id, path, depth) AS (
        SELECT id, ARRAY[id], 1
        FROM #{table_name}
        WHERE #{table_name}.id IN(#{ids.join(', ')})
      UNION ALL
        SELECT #{table_name}.id, path || #{table_name}.id, depth + 1
        FROM search_tree
        JOIN #{table_name} ON #{table_name}.parent_id = search_tree.id
        WHERE NOT #{table_name}.id = ANY(path)
    )
    SELECT id, depth, path FROM search_tree ORDER BY path
  SQL

  if ids.any?
    rel = select("*")
      .joins("JOIN (#{tree_sql}) tree ON tree.id = #{table_name}.id")
      .send(:extend, NodeRelationMethods)
  else
    Node.none
  end
end

def descendants
  self.class.find_with_all_descendants(self.id).where.not(id: self.id)
end

# This defines the methods we're going to monkey patch into the relation returned by
# find_with_all_descendants so that we can get the path and the depth of nodes
module NodeRelationMethods
  # All nodes found by original ids will have a depth of 1
  # depth is accessible by calling node.depth
  def with_depth
    # Because rails is a magical fairy unicorn, just adding this select statement
    # automatically adds the depth attribute to the data nodes returned by this
    # scope
    select("tree.depth as depth")
  end

  def with_path
    # Because rails is a magical fairy unicorn, just adding this select statement
    # automatically adds the path attribute to the data nodes returned by this
    # scope
    self.select("tree.path as path")
  end
end

Upvotes: 3

Views: 965

Answers (3)

CoderCanuck
CoderCanuck

Reputation: 21

This is very far past the point you needed it, but I'm faced with a very similar problem, and I'm wondering if the recursive query gem was looked at, or if it was available at the time, and if it would suit your needs in this case? I'm hoping not to monkey-patch a core class, nor ideally to override a method in ActiveRecord, but this seems to be a solid DSL-style extension to address what I should think is a decently common issue:

https://github.com/take-five/activerecord-hierarchical_query

Upvotes: 0

AndrewSwerlick
AndrewSwerlick

Reputation: 913

It looks like this can be done by overriding http://apidock.com/rails/v3.2.3/ActiveRecord/Relation/exec_queries. Here's some sample code boiled down to the bare esseentials

scope :find_with_all_descendants, -> (*ids) do
  #load all your records here...
  where(#...).extend(IncludeDescendants)
end

module IncludeDescendants

  def exec_queries
    records = super
    records.each do |r|
      #pre-populate/manipulate records here before returning
    end
  end 

end

Basically rails calls Relation#exec_queries right before the records are returned. By extending the relation we return in our scope, we can override exec_queries. In the overriden method, we get the original method results, manipulate them further and then return

Upvotes: 1

eirikir
eirikir

Reputation: 3842

If you add a path[1] to the select, you should be able to use Ruby's group_by (not AR group, which is for SQL GROUP BY) to group the selected records by the top-level parent ID. I've written an example of this below, with some refactoring of the scope to take advantage of chained scopes:

def self.all_descendants
  tree_sql =  <<-SQL
    WITH RECURSIVE search_tree(id, path, depth) AS (
        SELECT id, ARRAY[id], 1
        FROM (#{where("1=1").to_sql}) tmp
      UNION ALL
        SELECT #{table_name}.id, path || #{table_name}.id, depth + 1
        FROM search_tree
        JOIN #{table_name} ON #{table_name}.parent_id = search_tree.id
        WHERE NOT (#{table_name}.id = ANY(path))
    )
    SELECT id, depth, path FROM search_tree ORDER BY path
  SQL
  unscoped.select("*, tree.depth as depth, tree.path as path, tree.path[1] AS top_parent_id")
    .joins("JOIN (#{tree_sql}) tree ON tree.id = #{table_name}.id")
end

def descendants
  self.class.where(id: id).all_descendants.where.not(id: id)
end

This way, you can do the following:

collection = Node.where(id: [1,2,3,4]).all_descendants
collection.group_by(&:top_parent_id).each do |top_parent_id, descendant_group|
  top_parent = descendant_group.detect{|n| n.id == top_parent_id}
  top_parent_descendants = descendant_group - top_parent
  # do stuff with top_parent_descendants
end

Upvotes: 1

Related Questions