Buster Mills

Reputation: 33

mongos / intelligent routing

Suppose collection "coll" has an index

{ts : 1, X : 1 , Y : 1}

Where ts, X and Y are of type NumberLong.

The collection is sharded on ts, X.

Could you help me understand how the following queries will execute?

1) Unbounded range: will the following query be targeted only at the shards hosting ranges with ts > 100000000, or is it a global query?

db.coll.find({ts : {$gt : 100000000}}) 

2) Bounded range: if so, how about this one? Will it be targeted or global? Is mongos clever enough to parse out the query?

db.coll.find({$and : [{ts : {$gt : 100000000}}, {ts : {$lte : 110000000}}]})

3) Finally, what happens with multiple bounded ranges:

db.coll.find({$or : [{$and : [{ts : {$gt : 100000000}}, {ts : {$lte : 110000000}}]}, {$and : [{ts : {$gt : 500000000}}, {ts : {$lte : 510000000}}]}]})

I am unable to find any reference to range queries on http://www.mongodb.org/display/DOCS/Sharding+Introduction...!

Thanks in advance!

Upvotes: 3

Views: 237

Answers (1)

Remon van Vliet

Reputation: 18625

  1. This will hit only those shards that hold chunks whose ranges can possibly satisfy your range criteria. Mongos can determine this because you're querying on the shard key (ts is its leading field), so it knows which chunks are eligible. Queries to individual shards are executed in parallel.
  2. Same as 1). Mongos is intelligent enough to know which chunks to hit for this.
  3. Each clause in the $or will be evaluated separately. If clauses resolve to the exact same query plan and chunk range, they will (or at least should) be combined. You can check this by running explain() on your query in the sharded environment and inspecting the returned information. If it contains a "clauses" object, not all clauses in your $or can use the same query execution plan.
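To make the targeting idea above concrete, here is a toy model in plain JavaScript. It is only a sketch of the principle, not how mongos is actually implemented. The chunk boundaries and shard names (shardA, shardB, shardC) are made up for illustration, and it looks only at ts, ignoring the X part of the shard key.

```javascript
// Toy model of range-based targeting. NOT the real mongos code.
// Hypothetical chunk layout: each chunk covers the half-open range [min, max) on ts.
const chunks = [
  { shard: "shardA", min: -Infinity, max: 100000000 },
  { shard: "shardB", min: 100000000, max: 300000000 },
  { shard: "shardC", min: 300000000, max: Infinity },
];

// Return the shards whose chunks may overlap the ts interval between lo and hi.
// Both bounds are treated as inclusive, which is conservative: an exclusive
// bound ($gt, $lt) can at worst cause one extra shard to be contacted.
function targetShards(chunks, lo, hi) {
  const hit = new Set();
  for (const c of chunks) {
    // [min, max) overlaps [lo, hi] when min <= hi and max > lo.
    if (c.min <= hi && c.max > lo) hit.add(c.shard);
  }
  return [...hit].sort();
}

// An $or of several ranges targets the union of what each clause targets.
function targetOr(chunks, ranges) {
  const hit = new Set();
  for (const [lo, hi] of ranges) {
    for (const s of targetShards(chunks, lo, hi)) hit.add(s);
  }
  return [...hit].sort();
}

// Query 1: ts > 100000000 (unbounded above)
console.log(targetShards(chunks, 100000000, Infinity).join(", ")); // shardB, shardC
// Query 2: 100000000 < ts <= 110000000
console.log(targetShards(chunks, 100000000, 110000000).join(", ")); // shardB
// Query 3: $or of two bounded ranges
console.log(targetOr(chunks, [[100000000, 110000000], [500000000, 510000000]]).join(", ")); // shardB, shardC
```

With this made-up layout none of the three queries touches shardA, which is the behaviour described in points 1 to 3.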

That's the theory, but mongos query plans tend to be a little inconsistent. For example, with a "ts" shard key and two shards each holding one chunk, one covering MinKey to 50 and the other 51 to MaxKey, this query

{$or:[{$and:[{ts:{$gt:90}}, {ts:{$lt:100}}]}, {$and:[{ts:{$gt:91}}, {ts:{$lt:100}}]}]}

would correctly resolve into a single query to the second chunk. But this one (notice the 90 instead of 91 in the second clause)

{$or:[{$and:[{ts:{$gt:90}}, {ts:{$lt:100}}]}, {$and:[{ts:{$gt:90}}, {ts:{$lt:100}}]}]}

would actually result in a query to both shards, which makes very little sense since you're asking it to $or two clauses that are exactly the same. In short, use explain() and other monitoring tools to check how your queries behave in a sharded environment and to be sure they work as intended.
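A small helper can make the "clauses" check less error-prone. This is a sketch only: hasClauses is a name I made up, and because the exact shape of the explain() output varies by version, it makes no assumption about where the field sits and simply searches the whole result.

```javascript
// Hypothetical helper: report whether an explain() result contains a
// "clauses" field at any nesting depth.
function hasClauses(node) {
  if (node === null || typeof node !== "object") return false;
  if (Array.isArray(node)) return node.some(hasClauses);
  if (Object.prototype.hasOwnProperty.call(node, "clauses")) return true;
  return Object.values(node).some(hasClauses);
}

// Usage in the mongo shell, connected to a mongos (needs a live sharded cluster):
//   var plan = db.coll.find({$or : [{ts : {$gt : 90, $lt : 100}},
//                                   {ts : {$gt : 91, $lt : 100}}]}).explain();
//   print(hasClauses(plan));
```

If it prints true, at least one part of the plan was split into per-clause plans, so it is worth reading the full explain() output to see which shards were contacted.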

Upvotes: 2
