Reputation: 2502
What's the best way to handle a large result set with Rails and Postgres? I didn't have a problem until today, but now I'm trying to return a 124,000 record object of @network_hosts
, which has effectively DoS'd my development server.
My activerecord orm isn't the prettiest, but I'm pretty sure cleaning it up isn't going to help in relation to performance.
@network_hosts = []
@host_count = 0
@company.locations.each do |l|
if l.grace_enabled == nil || l.grace_enabled == false
l.network_hosts.each do |h|
@host_count += 1
@network_hosts.push(h)
@network_hosts.sort! { |x,y| x.ip_address <=> y.ip_address }
@network_hosts = @network_hosts.first(5)
end
end
end
In the end, I need to be able to return @network_hosts
to the controller for processing into the view.
Is this something that Sidekiq would be able to help with, or is it going to be just as long? If Sidekiq is the path to take, how do I handle not having the @network_hosts
object upon page load since the job is running asyncronously?
Upvotes: 0
Views: 923
Reputation: 11035
Something to remember, Rails won't execute any SQL queries until the result of the query is actually needed. (I'll be using User instead of NetworkHost so I can show you the console output as I go)
@users = User.where(first_name: 'Random');nil # No query run
=> nil
@users # query is now run because the results are needed (they are being output to the IRB window)
# User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."first_name" = $1 LIMIT $2 [["first_name", "Random"], ["LIMIT", 11]]
# => #<ActiveRecord::Relation [...]>
@users = User.where(first_name: 'Random') # query will be run because the results are needed for the output into the IRB window
# User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."first_name" = $1 LIMIT $2 [["first_name", "Random"], ["LIMIT", 11]]
# => #<ActiveRecord::Relation [...]>
Why is this important? It allows you to store the query you want to run in the instance variable and not execute it until you get to a view where you can use some of the nice methods of ActiveRecord::Batches. In particular, if you have some view (or export function, etc.) where you are iterating the @network_hosts
, you can use find_each
.
# Controller
@users = User.where(first_name: 'Random') # No query run
# view
@users.find_each(batch_size: 1) do |user|
puts "User's ID is #{user.id}"
end
# User Load (0.5ms) SELECT "users".* FROM "users" WHERE "users"."first_name" = $1 ORDER BY "users"."id" ASC LIMIT $2 [["first_name", "Random"], ["LIMIT", 1]]
# User's ID is 1
# User Load (0.4ms) SELECT "users".* FROM "users" WHERE "users"."first_name" = $1 AND ("users"."id" > 1) ORDER BY "users"."id" ASC LIMIT $2 [["first_name", "Random"], ["LIMIT", 1]]
# User's ID is 2
# User Load (0.3ms) SELECT "users".* FROM "users" WHERE "users"."first_name" = $1 AND ("users"."id" > 2) ORDER BY "users"."id" ASC LIMIT $2 [["first_name", "Random"], ["LIMIT", 1]]
# => nil
Your query is not executed until the view, where it will now load only 1,000 records (configurable) into memory at a time. Once it reaches the end of those 1,000 records, it will automatically run another query to fetch the next 1,000 records. So your memory is much more sane, at the cost of extra database queries (which are usually pretty quick)
Upvotes: 1
Reputation: 20263
I believe you want to (1) get rid of all that looping (you've got a lot of queries going on) and (2) do your sorting with your AR query instead of in the array.
Perhaps something like:
NetworkHost.
where(location: Location.where.not(grace_enabed: true).where(company: @company)).
order(ip_address: :asc).
tap do |network_hosts|
@network_hosts = network_hosts.limit(5)
@host_count = network_hosts.count
end
Something like that ought to do it in a single DB query.
I had to make some assumptions about how your associations are set up and that you're looking for locations where grace_enabled isn't true (nil or false).
I haven't tested this, so it may well be buggy. But, I think the direction is correct.
Upvotes: 1