jwchang

Reputation: 10864

MongoDB Index and ShardKey for range queries?

I will be using MongoDB 2.4.x.

That means I can use a hashed index.

I can use that index as the shard key, so the data will be distributed almost evenly across the servers and accessed evenly.

The problem arises when I try to do range queries.

My query would look like this:

 db.feeds.find({ age: { $gte: 20, $lte: 25}}).sort({timestamp: -1}).limit(10)

I think I have two options:

  1. Create an index on { age: 1, timestamp: -1 } and use it as the shard key.

  2. Use a hashed index as the shard key, keep the index above for the query, and cache recent query results in memory (memcached or redis), because a range query will hit many shards.

Which of these would be the more efficient strategy for handling range queries? Or are there any other suggestions for this problem?

Upvotes: 3

Views: 176

Answers (1)

Philipp

Reputation: 69663

Whether or not caching makes sense depends on how different your range queries are, how much data they return and how often the cache needs to be invalidated.

Adding another database technology like Redis or Memcached would add technical complexity to your project. It would require more know-how and man-hours to maintain the product, and it would create another point of failure. So if the caching can be done adequately in MongoDB itself, you should try that first.

You could implement the cache as another collection in MongoDB (possibly capped, but note that capped collections cannot be sharded), where the hashed shard key consists of the range delimiters of the query that produced the result.

A document in this cache collection would then look like this:

 { 
     age_range: {
         from: 20,
         to: 25
     },
     results: [
         ...
     ]
 }
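Because the whole `age_range` subdocument acts as the key, it helps to always build it the same way: MongoDB treats embedded documents with the same fields in a different order as different values. A minimal sketch of such a builder, in plain JavaScript, might look like the following. The function names `makeAgeRangeKey` and `makeCacheDocument` are hypothetical and only serve as an illustration.

```javascript
// Hypothetical helper: builds the cache key for an age range.
// It always emits `from` before `to`, so the same range always
// produces an identical subdocument (field order matters in MongoDB).
function makeAgeRangeKey(from, to) {
  if (!Number.isInteger(from) || !Number.isInteger(to) || from > to) {
    throw new RangeError("expected integers with from <= to");
  }
  return { from: from, to: to };
}

// Hypothetical helper: wraps query results in the cache document shape.
function makeCacheDocument(from, to, results) {
  return { age_range: makeAgeRangeKey(from, to), results: results };
}

const doc = makeCacheDocument(20, 25, []);
console.log(JSON.stringify(doc)); // {"age_range":{"from":20,"to":25},"results":[]}
```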

and your index like this:

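// "feed_cache" is a hypothetical collection name; substitute your own.
db.feed_cache.ensureIndex( {
         age_range: "hashed"
     }
);
// Hashed indexes do not support a unique constraint, so uniqueness of
// age_range has to be handled by the application, for example by
// writing cache entries with an upsert.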
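The lookup logic around such a cache is usually a read-through: return the cached result on a hit, otherwise run the real query and store its result. Here is a rough sketch of that flow in plain JavaScript. `RangeCache` is an in-memory stand-in for the cache collection and `runQuery` is a hypothetical callback standing in for the real feeds query; neither is MongoDB API.

```javascript
// In-memory stand-in for the cache collection (an assumption for illustration).
// Keys are serialized as "from:to"; in MongoDB the key would be the
// age_range subdocument instead.
class RangeCache {
  constructor() { this.entries = new Map(); }
  find(from, to) { return this.entries.get(`${from}:${to}`); }
  store(from, to, results) { this.entries.set(`${from}:${to}`, results); }
  invalidate() { this.entries.clear(); } // call when the underlying data changes
}

// Read-through lookup: serve a hit from the cache, otherwise run the
// query once and remember its result for the next caller.
function cachedRangeQuery(cache, runQuery, from, to) {
  const hit = cache.find(from, to);
  if (hit !== undefined) return hit;
  const results = runQuery(from, to);
  cache.store(from, to, results);
  return results;
}

// Usage: the second call with the same range never reaches the query.
let calls = 0;
const fakeQuery = (from, to) => { calls += 1; return [`feeds for ages ${from}-${to}`]; };
const cache = new RangeCache();
cachedRangeQuery(cache, fakeQuery, 20, 25);
cachedRangeQuery(cache, fakeQuery, 20, 25);
console.log(calls); // 1
```

How often `invalidate()` has to run is exactly the "how often the cache needs to be invalidated" question from above, and it decides whether the cache pays off.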

Keep in mind that this could become problematic when you have extremely large result sets, because the maximum document size in MongoDB is limited to 16MB.
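One way to guard against that is to check the size of a result before caching it. The sketch below (Node.js) uses the byte length of the JSON form as a rough estimate, which is not the true BSON size; in the mongo shell, `Object.bsonsize(doc)` should give you the exact figure. The names `approxSizeBytes` and `isCacheable` and the safety fraction are assumptions for illustration.

```javascript
// MongoDB's per-document limit: 16 MB.
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Rough estimate only: UTF-8 byte length of the JSON form,
// not the real BSON size of the document.
function approxSizeBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Hypothetical policy: cache only when the estimate stays below a
// safety fraction of the limit, since the estimate is imprecise.
function isCacheable(doc, safetyFraction = 0.5) {
  return approxSizeBytes(doc) <= MAX_BSON_BYTES * safetyFraction;
}

const small = { age_range: { from: 20, to: 25 }, results: [] };
console.log(isCacheable(small)); // true
```

Results that fail the check can simply be served straight from the original query without being cached.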

Upvotes: 1
