diasks2
diasks2

Reputation: 2142

Finding a faster ActiveRecord query (Ruby on Rails)

I have a model that is sorted in a particular order. My goal is to find a record from the model where the sum of a particular column of all previous records equals a certain number. The following example gets me what I need, but it is very slow, especially with a rather large table. Are there any faster ways to solve for the product.id where the sum of all previous products' points = 100000?

 total_points = 0
 find_point_level = 100000
 @products = Product.order("id").all
 @products.each do |product|
    total_points = product.points + total_points
    @find_product = product.id
    break if total_points >= find_point_level
 end

Update

Here are some times for a few of the solutions below. This is going through about 60,000 records. Times are for ActiveRecord.

Original example (above):
2685.0ms
1238.8ms
1428.0ms

Original example using find_each:
799.6ms
799.4ms
797.8ms

Creating a new column with the sums:
181.3ms
170.7ms
172.2ms

Upvotes: 5

Views: 2189

Answers (4)

Patrick Oscity
Patrick Oscity

Reputation: 54684

It turns out, there actually is a way to do this in SQL. First, let's set up some testing environment:

rails new foobar
cd foobar
rails g model Product name:string points:integer
rake db:migrate
rails console

In the Rails console, feed the DB with some records:

Product.new(name: 'Foo',  points: 1).save!
Product.new(name: 'Bar',  points: 2).save!
Product.new(name: 'Baz',  points: 3).save!
Product.new(name: 'Baf',  points: 4).save!
Product.new(name: 'Quux', points: 5).save!

Now i found a way of getting running totals in SQL in this post here. It works like this:

query = <<-SQL
  SELECT *, (
    SELECT SUM(points)
    FROM products
    WHERE id <= p.id
  ) AS total_points
  FROM products p
SQL

Running this query against the test DB gives us:

Product.find_by_sql(query).each do |p|
  puts p.name.ljust(5) + p.points.to_s.rjust(2) + p.total_points.to_s.rjust(3)
end

# Foo   1  1
# Bar   2  3
# Baz   3  6
# Baf   4 10
# Quux  5 15

So we can now use a HAVING clause (and a GROUP BY because this is needed for HAVING)to fetch only the products that match the condition and LIMIT the number of results to one:

query = <<-SQL
  SELECT *, (
    SELECT SUM(points)
    FROM products
    WHERE id <= p.id
  ) AS total_points
  FROM products p
  GROUP BY p.id
  HAVING total_points >= #{find_point_level}
  LIMIT 1
SQL

I'm really curious how this performs in your environment with many many records. Give it a try and tell me if it works for you, if you like.

Upvotes: 1

m_x
m_x

Reputation: 12564

  • this does not really solve the problem, but you can use find_each instead of each to load products in batches instead of loading all the table. see the guides

EDIT ignore the following, i forgot that window functions do not permit WHERE and HAVING clauses

  • if you are willing to use a non db-agnostic solution, you can use this (not tested):

    query = <<-SQL
      SELECT id, SUM(points) OVER (ORDER BY id) AS total_points
      FROM products
      HAVING total_points >= 100000
      LIMIT 1
    SQL
    
    @product = Product.find_all_by_sql( query )
    

this uses window functions that are NOT supported by all RDBMS (Postgresql does). Beware, once you have retrieved the @product, it will be a readonly record with only two attributes accessible: id and total_points

Upvotes: 0

Alexander Randa
Alexander Randa

Reputation: 868

If table is very big, you can use plain sql query:

find_point_level = 100000
Product.find_all_by_sql("SELECT SUM(points) FROM (SELECT points FROM products ORDER BY id LIMIT #{find_point_level}) AS subquery")

Also for column index should be exist index in the database.

Upvotes: -2

samuil
samuil

Reputation: 5081

You can try denormalizing your database, and keeping partial sums directly in products table. Simple query with where and limit would return you a proper answer in no time.

You need to create additional filter, that will update single record whenever product is added, and all products whenever product is deleted or it's points field were changed.

Upvotes: 6

Related Questions