Reputation: 947
I need help optimizing a series of queries in a Rails 5 app. The following explains what I am doing, but if it isn't clear, let me know and I will try to go into better detail.
I have the following methods in my models:
In my IncomeReport model:
class IncomeReport < ApplicationRecord
  def self.net_incomes_2015_totals_collection
    all.map(&:net_incomes_2015).compact
  end

  def net_incomes_2015
    incomes - producer.expenses_2015
  end

  def incomes
    total_yield * 1.15
  end
end
In my Producer model I have the following:
class Producer < ApplicationRecord
  def expenses_2015
    expenses.sum(&:expense_per_ha)
  end
end
In the Expense model I have:
class Expense < ApplicationRecord
  def expense_per_ha
    total_cost / area
  end
end
In the controller I have this (I am using a gem called descriptive_statistics to get min, max, quartiles, etc., in case you are wondering about that part at the end):
@income_reports_2015 = IncomeReport.net_incomes_2015_totals_collection.extend(DescriptiveStatistics)
Then in my view I use
<%= @income_reports_2015.descriptive_statistics[:min] %>
This code works when there are only a few objects in the database. However, now that there are thousands, the query takes forever to give a result. It takes so long that it times out!
How can I optimize this to get the most performant outcome?
Upvotes: 1
Views: 562
Reputation: 81
It looks like you have a few N+1 queries here. Each report grabs its producer in an individual query. Then each producer grabs each of its expenses in a separate query.
You could avoid the first issue by using preload(:producer) instead of plain all. However, the sums later are harder to avoid, since sum fires its own query for each producer's expenses.
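As a minimal sketch, the preload half on its own would look like this (it removes the per-report producer queries but still leaves one expenses query per producer):

def self.net_incomes_2015_totals_collection
  # One extra query loads every producer up front, instead of one query per report
  preload(:producer).map(&:net_incomes_2015).compact
end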
You can avoid that issue with something like
def self.net_incomes_2015_totals_collection
  joins(producer: :expenses).
    select('income_reports.id',
           'income_reports.total_yield * 1.15 - SUM(expenses.total_cost / expenses.area) AS net_incomes_2015').
    group('income_reports.id').
    # Read the SQL alias with []; calling the net_incomes_2015 instance
    # method would recompute it in Ruby and reintroduce the N+1 queries.
    map { |report| report[:net_incomes_2015] }.
    compact
end
to get everything in one query.
Upvotes: 1
Reputation: 3251
What's the structure of IncomeReport? Looking at the code, your problem lies in the all inside net_incomes_2015_totals_collection. all hits the database and returns every record, and then you map over them in Ruby. Overkill. Try to filter the data, query less, select less, and get the info you want directly with ActiveRecord; Ruby loops slow things down.
So, without knowing the table structure and its data, I'd do the following:
def self.net_incomes_2015_totals_collection
  # Filter to 2015 in SQL and pull only the column we need
  where(created_at: Date.new(2015, 1, 1).beginning_of_day..Date.new(2015, 12, 31).end_of_day).
    where.not(net_incomes_2015: nil).
    pluck(:net_incomes_2015)
end
Plus, I'd make sure there's a composite index on created_at and net_incomes_2015.
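A sketch of that migration, assuming both columns live on an income_reports table:

class AddIncomeReportIndex < ActiveRecord::Migration[5.0]
  def change
    # Composite index so the created_at range filter and the
    # net_incomes_2015 NOT NULL check can both use the index
    add_index :income_reports, [:created_at, :net_incomes_2015]
  end
end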
It will probably still be slow, but better than it is now. You should also think about aggregating the data in the background (Resque, Sidekiq, etc.) at midnight, and perhaps caching the result.
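For example, a minimal sketch of that idea with Sidekiq (the worker and cache key names are made up for illustration):

class NetIncomes2015Worker
  include Sidekiq::Worker

  def perform
    values = IncomeReport.net_incomes_2015_totals_collection
    # Precompute overnight; the controller then reads the cache instead of querying
    Rails.cache.write('net_incomes_2015_totals', values, expires_in: 1.day)
  end
end

Schedule it nightly (cron, sidekiq-cron, whenever, etc.) and have the controller read Rails.cache.read('net_incomes_2015_totals') instead of hitting the database.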
Hope it helps.
Upvotes: 1
Reputation: 2137
One approach might be to architect your application differently. I think a service-oriented architecture might be of use in this circumstance.
Instead of querying when the user goes to this view, you might want to use a worker that queries intermittently and writes the result to a CSV; when a user navigates to the view, you read from that CSV instead. This runs much faster because, rather than doing the query then and there (when the user navigates to the page), you're simply reading from a file that a background process created earlier.
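A rough sketch of that pattern (assuming Sidekiq for the worker; the class name and file path are made up):

require 'csv'

class IncomeReportCsvWorker
  include Sidekiq::Worker

  # Hypothetical path; put it somewhere shared and writable in production
  CSV_PATH = Rails.root.join('tmp', 'net_incomes_2015.csv')

  def perform
    # Run the expensive query in the background and dump one value per row
    CSV.open(CSV_PATH, 'w') do |csv|
      IncomeReport.net_incomes_2015_totals_collection.each { |value| csv << [value] }
    end
  end
end

# In the controller, read the precomputed file instead of querying:
# @income_reports_2015 = CSV.read(IncomeReportCsvWorker::CSV_PATH).
#                          flatten.map(&:to_f).extend(DescriptiveStatistics)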
Obviously this has its own set of challenges, but I've done it in the past to solve a similar problem. I wrote an app that fetched data from 10 different external APIs once a minute. The 10 fetches resulted in 10 objects in the DB, so 10 * 60 * 24 = 14,400 records per day. When a user loaded the page requiring this data, it pulled seven days' worth of records, i.e. 100,800 database rows. The query being done at runtime resulted in the same timeout problem; writing to a CSV and reading from it was my workaround.
Upvotes: 1