mike
mike

Reputation: 109

Rails Optimize query and loop through large entity

I have a method that outputs the following hash format for charting.

# Monthly (Jan - Dec)
{
  "john": [1,2,3,4,5,6,7,8,9,10,11,12],
  "mike": [1,2,3,4,5,6,7,8,9,10,11,12],
  "rick": [1,2,3,4,5,6,7,8,9,10,11,12]
}
# the indices represents the month
# e.g [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
# Index
# 0 = Jan
# 1 = Feb
# 2 = Mar
...

The following method loops through all the store invoices within given year with specific sales rep name and generate above outcome

def chart_data
  hash = Hash.new {|h,k| h[k] = [] }

  (1..12).each do |month|
    date_range = "1/#{month}/#{date.year}".to_date.all_month
    all_reps.each do |name|
      hash[name] << store.bw_invoices.where(sales_rep_name: name, 
      purchase_date: date_range).sum(:subtotal).to_f
    end
  end
  return hash
end

When I run run this method it takes over 4~5 sec to execute. I really need to optimize this query. I came up with two solutions that I think it would help but I would love to get some of your expertise.

  1. move it to background job
  2. perform a SQL query to optimize(I need help with this if this is optimal)

Thank you so much for your time

Upvotes: 1

Views: 2082

Answers (1)

Adamantish
Adamantish

Reputation: 1999

Yes, you've found a problem that is very hard to solve efficiently without letting the database do the hard work.

Assuming your dataset is potentially too large to load a whole year raw into ruby objects, this approach using just 1 postgreSQL query would be probably the best kind of idea:

More SQL approach

def chart_data
  result = Hash.new {|h,k| h[k] = [] }

  total_lines = store.bw_invoices.select("sales_rep_name, to_char(purchase_date, 'mm') as month, sum(subtotal) as total")
                                 .where(purchase_date: Date.today.all_year)
                                 .group("sales_rep_name, to_char(purchase_date, 'mm')")                                 

  total_lines.each do |total_line|
    result[total_line.sales_rep_name][total_line.month.to_i - 1] = total_line.total.to_f
  end

  result
end

Note that this solution will leave nil rather than 0 for months where a rep had no sales. And if their last month with sales was June then there will only be 6 items in the array.

We can avoid this either with more complex SQL left joining from a virtual table or by filling in the array gaps afterwards. However, depending on how you setup your charting this might make no practical difference anyway.

More ruby approach

def chart_data
  result = Hash.new {|h,k| h[k] = [] }

  (1..12).each do |month|
    date_range = "1/#{month}/#{Date.today.year}".to_date.all_month
    rows = store.bw_invoices.select("sales_rep_name, SUM(subtotal) as total")
                            .where(purchase_date: date_range)
                            .group(:sales_rep_name)

    all_reps.each do |rep_name|
      row = rows.detect { |x| x.sales_rep_name == rep_name }
      result[rep_name] << (row ? row.total : 0).to_f
    end
  end

  result
end

This is more similar to your approach but takes the querying outside of the inner loop so we do 12 queries instead of 12 * number of reps. The detect used may become a little slow but only if there are thousands of reps. In which case you could sort both all_reps and the query output and implement your own kind of merge join but at that point you're getting into complexity you might as well let the database handle again.

Upvotes: 1

Related Questions