Reputation: 2406
Ruby Mongo Driver question:
How do I read documents from a collection in batches of 5,000 at a time, until the last document has been read, without loading the entire collection into memory first?
This approach works badly for me because it pulls everything into one array:
mongo = MongoClient.new('localhost', 27017)['sampledb']['samplecoll']
mongo.find.to_a....
Upvotes: 0
Views: 233
Reputation: 3402
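Mongo::Collection#find returns a Mongo::Cursor that is Enumerable. For batch processing Enumerable#each_slice is your friend and well worth adding to your toolkit. It builds one slice at a time as the cursor is iterated, so the full result set is never turned into a single array the way it is with to_a.
Hope this helps.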
find_each_slice_test.rb
require 'mongo'
require 'test/unit'

class FindEachSliceTest < Test::Unit::TestCase
  def setup
    @samplecoll = Mongo::MongoClient.new('localhost', 27017)['sampledb']['samplecoll']
    @samplecoll.remove
  end

  def test_find_each_slice
    12345.times { |i| @samplecoll.insert({ i: i }) }
    slice__max_size = 5000
    @samplecoll.find.each_slice(slice__max_size) do |slice|
      puts "slice.size: #{slice.size}"
      assert(slice__max_size >= slice.size)
    end
  end
end
ruby find_each_slice_test.rb
Run options:
# Running tests:
slice.size: 5000
slice.size: 5000
slice.size: 2345
.
Finished tests in 6.979301s, 0.1433 tests/s, 0.4298 assertions/s.
1 tests, 3 assertions, 0 failures, 0 errors, 0 skips
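If you want to convince yourself that each_slice really consumes its source incrementally, here is a minimal sketch that needs no MongoDB server. It uses a plain Ruby Enumerator as a hypothetical stand-in for the cursor (the names fake_cursor and produced are made up for illustration) and records how many documents had been generated when the first slice arrived. A real Mongo::Cursor also fetches from the server in its own batches, which this sketch does not model.

```ruby
# Hypothetical stand-in for a Mongo::Cursor: an Enumerator that generates
# documents one at a time and counts how many it has generated so far.
produced = 0
fake_cursor = Enumerator.new do |yielder|
  12_345.times do |i|
    produced += 1
    yielder << { i: i }
  end
end

slice_sizes = []
produced_at_first_slice = nil

fake_cursor.each_slice(5_000) do |slice|
  # Remember how far the source had advanced when the first slice was handed over.
  produced_at_first_slice ||= produced
  slice_sizes << slice.size
end

puts "slice sizes: #{slice_sizes.inspect}"
puts "documents generated when the first slice arrived: #{produced_at_first_slice}"
```

The first slice is handed to the block after only 5,000 documents have been generated, not all 12,345, which is the same behaviour the test above relies on with the real cursor.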
Upvotes: 1