GTDev

Reputation: 5528

Mongoid random document

Let's say I have a collection of users. Is there a way to use Mongoid to find n random users in the collection without returning the same user twice? For now, let's say the User model looks like this:

class User
  include Mongoid::Document
  field :name
end

Simple huh?

Thanks

Upvotes: 11

Views: 8908

Answers (9)

Cyril Duchon-Doris

Reputation: 13959

MongoDB 3.2 comes to the rescue with the $sample aggregation stage (see the MongoDB documentation).

EDIT: The most recent version of Mongoid has implemented $sample, so you can call YourCollection.all.sample(5)

Previous versions of mongoid

Mongoid did not support sample before Mongoid 6, so with older versions you have to run this aggregation query through the Mongo driver:

samples = User.collection.aggregate([ { '$sample': { size: 3 } } ])
# call samples.to_a if you want to get the objects in memory

What you can do with that

I believe the functionality should make its way to Mongoid soon, but in the meantime you can wrap the query in a helper:

module Utility
  module_function
  def sample(model, count)
    ids = model.collection.aggregate([ 
      { '$sample': { size: count } }, # Sample from the collection
      { '$project': { _id: 1} }       # Keep only ID fields
    ]).to_a.map(&:values).flatten     # Some Ruby magic

    model.find(ids)
  end
end

Utility.sample(User, 50)

Upvotes: 8

Markus Graf

Reputation: 533

The approach from @moox is really interesting, but I doubt that monkeypatching all of Mongoid is a good idea here. So my approach is to write a concern, Randomizable, that can be included in each model where you need this feature. This goes in app/models/concerns/randomizable.rb:

module Randomizable
  extend ActiveSupport::Concern

  module ClassMethods
    def random(n = 1)
      # n distinct random offsets (the trailing blockless collect! was a no-op)
      indexes = (0..count - 1).sort_by { rand }.slice(0, n)

      return skip(indexes.first).first if n == 1
      indexes.map { |index| skip(index).first }
    end
  end
end

Then your User model would look like this:

class User
  include Mongoid::Document
  include Randomizable

  field :name
end

And the tests....

require 'spec_helper'

class RandomizableCollection
  include Mongoid::Document
  include Randomizable

  field :name
end

describe RandomizableCollection do
  before do
    RandomizableCollection.create name: 'Hans Bratwurst'
    RandomizableCollection.create name: 'Werner Salami'
    RandomizableCollection.create name: 'Susi Wienerli'
  end

  it 'returns a random document' do
    srand(2)

    expect(RandomizableCollection.random(1).name).to eq 'Werner Salami'
  end

  it 'returns an array of random documents' do
    srand(1)

    expect(RandomizableCollection.random(2).map(&:name)).to eq ['Susi Wienerli', 'Hans Bratwurst']
  end
end

Upvotes: 0

Dan Healy

Reputation: 757

The best solution is going to depend on the expected size of the collection.

For tiny collections, just load all of them into an array and call .shuffle.slice!(0, n) on it.

For small sizes of n, you can get away with something like this:

result = (0..User.count - 1).sort_by { rand }.slice(0, n).collect! { |i| User.skip(i).first }

For large sizes of n, I would recommend creating a "random" column to sort by. See here for details: http://cookbook.mongodb.org/patterns/random-attribute/ https://github.com/mongodb/cookbook/blob/master/content/patterns/random-attribute.txt
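
To make the "random" field idea a bit more concrete, here is a minimal in-memory sketch of how I understand the pattern. It uses plain Ruby structs rather than Mongoid, and Doc, build_docs and pick_random are hypothetical names made up for the illustration. In a real collection the same steps would presumably be an indexed random field, a range query against a fresh random number, a sort and a limit.

```ruby
# In-memory sketch of the "random attribute" pattern (no MongoDB involved).
# Doc, build_docs and pick_random are hypothetical names for illustration.
Doc = Struct.new(:name, :random)

# Give every record a random float once, at creation time.
def build_docs(names, rng = Random.new)
  names.map { |name| Doc.new(name, rng.rand) }
end

# Pick n distinct records: start at a random threshold, walk upward in
# `random` order, and wrap around to the lowest values if we run out.
def pick_random(docs, n, rng = Random.new)
  threshold = rng.rand
  sorted = docs.sort_by(&:random)
  above = sorted.select { |d| d.random >= threshold }
  below = sorted.select { |d| d.random < threshold }
  (above + below).first(n)
end

docs = build_docs(%w[Ann Bob Cid Dee Eve])
picked = pick_random(docs, 3)
```

Because every record appears exactly once in the sorted list, the same record cannot come back twice, which is what the question asks for.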

Upvotes: 14

tothemario

Reputation: 6299

If you just want one document, and don't want to define a new criteria method, you could just do this:

random_model = Model.skip(rand(Model.count)).first

If you want to find a random model based on some criteria:

criteria = Model.scoped_whatever.where(conditions) # query example
random_model = criteria.skip(rand(criteria.count)).first
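
If you need n distinct documents rather than one, the same skip trick can be repeated over n distinct offsets. A rough sketch, assuming the scope responds to count, skip and first the way a Mongoid criteria does; random_documents is a hypothetical helper, and FakeScope is only a stand-in so the snippet runs without a database:

```ruby
# Hypothetical helper: works with anything that responds to count, skip and
# first the way a Mongoid criteria does.
def random_documents(scope, n)
  total = scope.count
  # Array#sample(n) returns distinct elements, so no offset repeats.
  offsets = (0...total).to_a.sample(n)
  offsets.map { |offset| scope.skip(offset).first }
end

# Stand-in for a criteria, only here so the sketch runs without MongoDB.
class FakeScope
  def initialize(items)
    @items = items
  end

  def count
    @items.size
  end

  def skip(offset)
    FakeScope.new(@items.drop(offset))
  end

  def first
    @items.first
  end
end

users = FakeScope.new(%w[ann bob cid dee eve])
picked = random_documents(users, 3)
```

Note that this issues one query per document, so it is only reasonable for small n.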

Upvotes: 19

Sui Mak

Reputation: 1

I think it is better to focus on randomizing the returned result set so I tried:

Model.all.to_a.shuffle

Hope this helps.

Upvotes: -2

Cyrill Dorovsky

Reputation: 73

Just encountered such a problem. Tried

Model.all.sample

and it works for me
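
For the "n users without repeats" part of the question, sample also takes a count. A small sketch with a plain Ruby array standing in for the loaded documents (this assumes the criteria ends up as an in-memory array before sampling, so it is probably only sensible for small collections):

```ruby
# A plain Ruby array stands in for the loaded documents here.
users = %w[ann bob cid dee eve]

one  = users.sample      # a single random element
many = users.sample(3)   # three elements, no index is picked twice
all  = users.sample(10)  # asking for more than exist returns at most users.size
```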

Upvotes: 0

apneadiving

Reputation: 115541

Since I want the result to remain a criteria (so it can still be chained), I use a scope:

scope :random, ->{
  random_field_for_ordering = fields.keys.sample
  random_direction_to_order = %w(asc desc).sample
  order_by([[random_field_for_ordering, random_direction_to_order]])
}

Upvotes: 0

Moox

Reputation: 1117

If you really want simplicity you could use this instead:

class Mongoid::Criteria

  def random(n = 1)
    # n distinct random offsets (the trailing blockless collect! was a no-op)
    indexes = (0..self.count - 1).sort_by { rand }.slice(0, n)

    if n == 1
      return self.skip(indexes.first).first
    else
      return indexes.map{ |index| self.skip(index).first }
    end
  end

end

module Mongoid
  module Finders

    def random(n = 1)
      criteria.random(n)
    end

  end
end

You just have to call User.random(5) and you'll get 5 random users. It'll also work with filtering, so if you want only registered users you can do User.where(:registered => true).random(5).

This will take a while on large collections, so there I recommend an alternative: pick a random slice of the index range (e.g. 25 000 to 30 000) and randomize only that slice.
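
A minimal sketch of that slice idea, in plain Ruby. windowed_offsets and the default window size of 5 000 are assumptions made up for this example, not part of Mongoid:

```ruby
# Hypothetical helper: returns n distinct offsets drawn from one random
# window of the collection instead of shuffling every index.
def windowed_offsets(count, n, window_size = 5_000)
  window_size = [window_size, count].min
  start = rand(count - window_size + 1) # the window always fits inside 0...count
  (start...(start + window_size)).to_a.sample(n)
end

offsets = windowed_offsets(30_000, 5)
```

Each offset could then be passed to skip(offset).first as in random above. The trade-off is that all picks come from the same window, so the result is less evenly spread than a full shuffle.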

Upvotes: 3

RameshVel

Reputation: 65877

You can do this by

  1. Generate a random offset that still leaves room to pick the next n elements without running past the end of the collection.
  2. For example, assume the count is 10 and n is 5.
  3. Check whether n is less than the total count.
  4. If not, set the offset to 0 and go to step 8.
  5. If so, subtract n from the total count, which gives 5.
  6. Pick a random number in the range from 0 up to that result (assume it is 2).
  7. Use the random number 2 as the offset.
  8. Take the random 5 users by passing this offset as the skip and n (5) as the limit.
  9. You get the 3rd through 7th users.

code

>> cnt = User.count
=> 10
>> n = 5
=> 5
>> offset = 0
=> 0
>> if n < cnt
>>   offset = rand(cnt - n + 1)  # + 1 so the last window can be chosen too
>> end
=> 2
>> User.skip(offset).limit(n)

and you can put this in a method

def get_random_users(n)
  offset = 0
  cnt = User.count
  if n < cnt
    # + 1 so the last possible window (offset cnt - n) can be chosen too
    offset = rand(cnt - n + 1)
  end
  User.skip(offset).limit(n)
end

and call it like

rand_users = get_random_users(5)

hope this helps

Upvotes: 2
