Ben Flynn

Reputation: 18922

Getting random entry from Objectify entity

How can I get a random element out of a Google App Engine datastore using Objectify? Should I fetch all of an entity's keys and choose randomly from them or is there a better way?

Upvotes: 3

Views: 1011

Answers (4)

Johnny Wu

Reputation: 1528

I pretty much adapted the algorithm provided by MatejC, with three changes:

  1. Instead of using count() or the datastore service factory (DatastoreServiceFactory.getDatastoreService()), I have an entity that keeps track of the total count of the entities I am interested in. The reasons for this approach: (a) count() can be expensive when you are dealing with a lot of objects; (b) you can't test the datastore service factory locally, and testing in prod is just bad practice.

  2. Generating the random number: ThreadLocalRandom.current().nextLong(1, maxRange)

  3. Instead of using limit(), I use offset(), so I don't have to worry about sorting.
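The three points above can be sketched roughly as follows. This is a plain-Java illustration, not Objectify code: the in-memory list stands in for the datastore, `trackedCount` stands in for the value kept in the counter entity, and the class and method names (`RandomOffsetPick`, `fetchAtOffset`, `pickRandom`) are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

final class RandomOffsetPick {
    // Stand-in for a query that skips `offset` rows and returns the next one
    // (what offset(n) followed by a limit of 1 would do against the datastore).
    static <T> T fetchAtOffset(List<T> rows, long offset) {
        return rows.get((int) offset);
    }

    // trackedCount plays the role of the total stored in the counter entity,
    // so no count() query is needed here.
    static <T> T pickRandom(List<T> rows, long trackedCount) {
        if (trackedCount <= 0) {
            throw new IllegalArgumentException("no entities to pick from");
        }
        // nextLong(origin, bound): origin is inclusive, bound is exclusive,
        // so this yields a zero-based offset in [0, trackedCount).
        long offset = ThreadLocalRandom.current().nextLong(0, trackedCount);
        return fetchAtOffset(rows, offset);
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a", "b", "c", "d");
        System.out.println(pickRandom(rows, rows.size()));
    }
}
```

Note that the sketch starts the range at 0 rather than 1. With a zero-based offset, a lower bound of 1 would mean the first entity is never selected, so whether `nextLong(1, maxRange)` is right depends on how `maxRange` and the offset are defined in the surrounding code.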

Upvotes: 0

Steven Roose

Reputation: 2769

Quoted from this post about selecting some random elements from an Objectified datastore:

If your ids are sequential, one way would be to randomly select 5 numbers from the id range known to be in use. Then use a query with an "in" filter().

If you don't mind the 5 entries being adjacent, you can use count(), limit(), and offset() to randomly find a block of 5 entries.

Otherwise, you'll probably need to use limit() and offset() to randomly select one entry out at a time.

-- Josh
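As an illustration of the "block of adjacent entries" idea from the quote, here is a minimal plain-Java sketch. The list stands in for the datastore, `rows.size()` for count(), and `subList` for an offset()/limit() query; `RandomBlockPick` and `pickBlock` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

final class RandomBlockPick {
    // Returns up to blockSize adjacent rows starting at a random offset.
    static <T> List<T> pickBlock(List<T> rows, int blockSize) {
        int count = rows.size();                 // stand-in for count()
        int size = Math.min(blockSize, count);   // fewer rows than requested: take them all
        if (size == 0) {
            return new ArrayList<>();
        }
        // Random start in [0, count - size], so the block never runs past the end.
        int offset = ThreadLocalRandom.current().nextInt(count - size + 1);
        // Stand-in for offset(offset) combined with limit(size).
        return new ArrayList<>(rows.subList(offset, offset + size));
    }
}
```

Calling `pickBlock(rows, 5)` returns 5 neighbouring entries from one query. Calling `pickBlock(rows, 1)` five times gives independent picks at the cost of five queries, which is the trade-off the quote describes.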

Upvotes: 0

MatejC

Reputation: 2237

You don't need to fetch everything. For example:

  1. countall = ofy.query(X.class).count() // http://groups.google.com/group/objectify-appengine/browse_frm/thread/3678cf34bb15d34d/82298e615691d6c5?lnk=gst&q=count#82298e615691d6c5
  2. rnd = random number in [1..countall]
  3. ofy.query(X.class).order("-date").limit(rnd); // for example -date, or some other indexed chronological field
  4. The last id fetched is your random pick (on average you fetch 50% of the entities, or at least the first read is on average 50% smaller).

Improvement (to keep a smaller key table in the cache):

After the first read, remember every Xth element: cache those ids and their positions. Next time, start the query from the nearest cached id before the target position and read forward from there, so the limit (".limit(rnd%X)") is at most X-1.
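A rough plain-Java sketch of this checkpoint idea, under some assumptions: a sorted `TreeSet` of ids stands in for the entities ordered by an indexed field, `tailSet` stands in for a "greater than or equal to this id" query, and the names `CheckpointIndex`, `step`, and `idAt` are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class CheckpointIndex {
    private final TreeSet<Long> ids;                           // stand-in for the ordered entities
    private final List<Long> checkpoints = new ArrayList<>();  // ids at positions 0, X, 2X, ...
    private final int step;                                    // the "X" from the text

    CheckpointIndex(TreeSet<Long> ids, int step) {
        this.ids = ids;
        this.step = step;
        int position = 0;
        for (Long id : ids) {                 // the one-off full read
            if (position % step == 0) {
                checkpoints.add(id);          // remember every Xth id
            }
            position++;
        }
    }

    // Returns the id at the given zero-based position, reading at most `step` rows.
    long idAt(int position) {
        if (position < 0 || position >= ids.size()) {
            throw new IndexOutOfBoundsException("position " + position);
        }
        long start = checkpoints.get(position / step);  // nearest checkpoint at or before position
        int remaining = position % step;                // rows still to skip, at most step - 1
        long last = start;
        int read = 0;
        // Stand-in for: filter on id >= start, limit(remaining + 1), keep the last row.
        for (Long id : ids.tailSet(start, true)) {
            last = id;
            if (read++ == remaining) {
                break;
            }
        }
        return last;
    }
}
```

With ids 10, 20, ..., 100 and `step = 3`, `idAt(5)` starts at the cached id 40 and reads three rows instead of six. The cached positions drift as soon as entities are inserted or deleted before a checkpoint, so a real version would need some way to refresh them.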

Random is just random. If the pick doesn't need to be close to 100% fair, you can estimate the chronological field value instead (for example, if you have 1000 records spread over 10 days, for random number 501 select the second element after the fifth day).

Another option, if you have a chronological date field (or similar): fetch the elements dated between a random date and that random date + 1 (you need to know the first date and the last date), then pick randomly among the fetched records. If the query returns nothing, select the first element greater than the random date, etc...
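The date-window option could look something like the following plain-Java sketch. A `NavigableMap` keyed by timestamp (in milliseconds) stands in for an indexed date field, `subMap` for the two date filters, and `ceilingEntry` for the "greater than" fallback; the class name, the one-day window, and the passed-in `Random` are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.Random;

final class RandomDateWindowPick {
    static final long DAY_MILLIS = 24L * 60 * 60 * 1000;

    static String pick(NavigableMap<Long, String> byTimestamp, Random rng) {
        if (byTimestamp.isEmpty()) {
            throw new IllegalArgumentException("no records");
        }
        long first = byTimestamp.firstKey();   // the first date must be known
        long last = byTimestamp.lastKey();     // and so must the last date
        // Random window start in [first, last]; the min() guards against rounding.
        long span = last - first + 1;
        long start = Math.min(last, first + (long) (rng.nextDouble() * span));
        // Stand-in for two filters: date >= start and date < start + one day.
        List<String> window = new ArrayList<>(
                byTimestamp.subMap(start, true, start + DAY_MILLIS, false).values());
        if (!window.isEmpty()) {
            return window.get(rng.nextInt(window.size()));   // second step: random among fetched
        }
        // Empty window: fall back to the first record at or after the random date.
        // This always exists because start <= last.
        return byTimestamp.ceilingEntry(start).getValue();
    }
}
```

This is not a uniform pick: a record in a sparse stretch of dates has fewer competitors in its window than one in a busy stretch, so it is chosen more often.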

Upvotes: 0

Nick Johnson

Reputation: 101149

Assign a random number between 0 and 1 to each entity when you store it. To fetch a random record, generate another random number between 0 and 1, and query for the entity with the smallest stored random value greater than that number.
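A minimal plain-Java sketch of this approach, with a `TreeMap` standing in for an indexed random-value property and `higherEntry` standing in for a "greater than, ordered ascending, first result" query. The class and method names are hypothetical, and the wrap-around to the smallest value when nothing is greater than the probe is an addition of this sketch, not part of the answer.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ThreadLocalRandom;

final class RandomKeyStore {
    // Stored random value -> entity; stands in for an indexed property on the entity.
    private final NavigableMap<Double, String> byRandomKey = new TreeMap<>();

    // Assigns the random value at write time, as the answer suggests.
    void save(String entity) {
        saveWithKey(ThreadLocalRandom.current().nextDouble(), entity);
    }

    // Split out so the key can be fixed when testing.
    void saveWithKey(double randomKey, String entity) {
        byRandomKey.put(randomKey, entity);
    }

    String pickRandom() {
        return pickAbove(ThreadLocalRandom.current().nextDouble());
    }

    // Returns the entity with the smallest stored value greater than probe.
    String pickAbove(double probe) {
        if (byRandomKey.isEmpty()) {
            throw new IllegalStateException("store is empty");
        }
        Map.Entry<Double, String> hit = byRandomKey.higherEntry(probe);
        if (hit == null) {
            // Assumption: wrap around to the smallest value when the probe
            // lands above every stored value.
            hit = byRandomKey.firstEntry();
        }
        return hit.getValue();
    }
}
```

Each pick costs a single indexed lookup regardless of how many entities exist, with no count() and no offset. The price is an extra indexed property, and how often an entity is picked depends on the gap below its stored value, so the picks are only approximately uniform.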

Upvotes: 2
