Mr. Demetrius Michael

Reputation: 2406

Mongo / Ruby driver output specific number of documents at a time?

Ruby Mongo Driver question:

How do I read a collection in batches of 5_000 documents at a time, continuing until the last document, without first loading the entire collection into memory?

This is a really bad method for me, because it loads everything at once:

@whois = Mongo::MongoClient.new('localhost', 27017)['sampledb']['samplecoll']
@whois.find.to_a....

Upvotes: 0

Views: 233

Answers (1)

Gary Murakami

Reputation: 3402

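Mongo::Collection#find returns a Mongo::Cursor, which is Enumerable. The cursor fetches documents from the server as you iterate, so the whole collection is only pulled into memory if you call something like to_a. For batch processing, Enumerable#each_slice is your friend and well worth adding to your toolkit.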

Hope that you like this.

find_each_slice_test.rb

require 'mongo'
require 'test/unit'

class FindEachSliceTest < Test::Unit::TestCase
  def setup
    @samplecoll = Mongo::MongoClient.new('localhost', 27017)['sampledb']['samplecoll']
    @samplecoll.remove
  end

  def test_find_each_slice
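    # Insert 12345 documents so the last slice is a partial one.
    12345.times { |i| @samplecoll.insert( { i: i } ) }
    slice_max_size = 5000
    # each_slice pulls documents from the cursor and yields them in arrays
    # of at most slice_max_size, so only one slice is built at a time.
    @samplecoll.find.each_slice(slice_max_size) do |slice|
      puts "slice.size: #{slice.size}"
      assert(slice_max_size >= slice.size)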
    end
  end
end

ruby find_each_slice_test.rb

Run options: 

# Running tests:

slice.size: 5000
slice.size: 5000
slice.size: 2345
.

Finished tests in 6.979301s, 0.1433 tests/s, 0.4298 assertions/s.

1 tests, 3 assertions, 0 failures, 0 errors, 0 skips
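
If you want to see the batching behaviour without a running MongoDB server, here is a minimal sketch in plain Ruby. It assumes nothing about the driver: fake_cursor and batch_sizes are hypothetical helper names made up for this illustration, and the Enumerator only stands in for a Mongo::Cursor by yielding one hash at a time.

```ruby
# Hypothetical stand-in for a Mongo::Cursor: an Enumerator that produces
# documents one at a time instead of building an array up front.
def fake_cursor(total)
  Enumerator.new do |yielder|
    total.times { |i| yielder << { i: i } }
  end
end

# Hypothetical helper: walk any Enumerable in slices of at most batch_size
# and record how many documents each slice contained.
def batch_sizes(enumerable, batch_size)
  sizes = []
  enumerable.each_slice(batch_size) { |slice| sizes << slice.size }
  sizes
end

sizes = batch_sizes(fake_cursor(12_345), 5_000)
puts sizes.inspect

# Without a block, each_slice returns an Enumerator, so you can also take
# just the first batch and stop iterating there.
first_batch = fake_cursor(12_345).each_slice(5_000).first
puts first_batch.size
```

The same each_slice call works on anything Enumerable, which is why it carries over unchanged to the cursor returned by find in the test above.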

Upvotes: 1
