Roman Pushkin
Roman Pushkin

Reputation: 6079

Rails ActiveRecord raw sql read data without loading everything into memory (without pagination)

.NET has SqlDataReader:

using(SqlDataReader rdr = cmd.ExecuteReader())
{
    while (rdr.Read())
    {
        var myString = rdr.GetString(0);
        // ...
    }
}

I need to use raw sql with Rails and ActiveRecord (existing project, can't add any gems).

What I can do now is just this:

ActiveRecord::Base.connection.exec_query(query)

Which works, but for my select ... loads all the rows into memory. I would really like to iterate through results without loading it into memory like I can do it in .NET

Is it possible with Ruby/Rails/ActiveRecord?

I'm aware of pagination, just wonder if there is another way?

Upvotes: 2

Views: 1621

Answers (2)

ulferts
ulferts

Reputation: 2242

It probably depends on the use case. If you want to stick to the rails way using find_each as already proposed is sound advise.

As your question suggests that you are interested in lower level modifications and speed I would like to propose another option as instantiating ActiveRecord models can have quite an overhead.

It wouldn't be hard to write a wrapper around

ActiveRecord::Base.connection.exec_query(query)

that would behave similar to the SqlDataReader you mentioned:

class SqlDataReader
  attr_accessor :sql,
                :batch_size,
                :max_records

  def initialize(sql, batch_size, max_records = 100000)
    self.sql = sql
    self.batch_size = batch_size
    self.max_records = max_records
  end

  # takes a block that is yielded with each record fetched
  def read
    offset = 0

    # Fetch the next batch of records from the db.
    # Have an additional safeguard to not run into an infinite loop.
    # One might consider altering the safeguard to max db reads to be even safer
    while !(results = ActiveRecord::Base.connection.exec_query(query(offset))).empty? &&
          offset < max_records do

      records = results.to_hash

      offset += records.length

      # Iterate through the records.
      # Does not have to use #to_hash, could also be e.g. #rows
      results.to_hash.each do |record|
        yield record
      end
    end
  end

  # granted, this is dirty. There are probably better ways.
  def query(offset)
    sql + " LIMIT #{batch_size} OFFSET #{offset}"
  end
end

This could then be used like:

reader = SqlDataReader.new("SELECT ...", 100)

reader.read do |record|
  # do something
end

Upvotes: 1

NM Pennypacker
NM Pennypacker

Reputation: 6942

You might consider using find_each. The default batch size is 1000, but you can pass the option to be whatever value you like. https://apidock.com/rails/ActiveRecord/Batches/ClassMethods/find_each

Another method that you might find useful if you don't want to instantiate every object is pluck https://apidock.com/rails/ActiveRecord/Calculations/pluck

Upvotes: 2

Related Questions