Reputation: 201
I have some very basic questions about choosing a shard key.
In our application, we build _id fields from a coarse-grained attribute (country) and a per-country monotonically increasing sequence number, e.g. IN_1. According to several online references and books (e.g. http://www.kchodorow.com/blog/2011/01/04/how-to-choose-a-shard-key-the-card-game/), it is better to have a compound shard key: coarse-grained key + search key. We also store the country as a separate attribute, ctry, and have an index on it.
Most, if not all, of our queries on these collections are going to be either country-based searches or _id lookups.
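What would be the best choice for a shard key?
1. just _id - since ctry is already baked into it - will it make queries starting with ctry slow?
2. {ctry: 1, _id: 1} - but part of my _id is monotonically increasing sequence.
3. {ctry: 1, _id: hashed} - Promises both read locality and write distribution. Is this supported by MongoDB?
4. just {_id: hashed} - will it make queries starting with ctry slow?
I am inclined to use option 3, but is that possible? It is not very clear from the MongoDB docs.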
Upvotes: 1
Views: 145
Reputation: 11671
just _id - since ctry is already baked into it - will it make queries starting with ctry slow?
I don't recommend this. _id is monotonically increasing, so it will concentrate writes on a single shard. If you are also doing queries with the shape { "ctry" : "United States" }, then they will be broadcast to all shards, not targeted.
{ctry: 1, _id: 1} - but part of my _id is monotonically increasing sequence.
_id is monotonically increasing but, as long as you are inserting documents whose shard key values contain different values for ctry, the shard key values are not monotonically increasing, so you will not be concentrating writes on a single shard. However, for a given country c, all writes will go to only one of the shards containing chunks with ctry = c.
This seems reasonable.
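To make the point about the compound key concrete, here is a minimal Python sketch of range-based routing on (ctry, _id). The split points, shard names, and the route helper are hypothetical and only approximate how chunks are assigned; real MongoDB compares keys in BSON order and moves chunks between shards over time.

```python
from bisect import bisect_right

# Hypothetical chunk boundaries on the compound key (ctry, _id).
# Chunk 0 holds keys below ("IN", ""), chunk 1 holds keys from ("IN", "")
# up to ("US", ""), and chunk 2 holds everything from ("US", "") upward.
SPLIT_POINTS = [("IN", ""), ("US", "")]
CHUNK_TO_SHARD = ["shard0", "shard1", "shard2"]  # illustrative placement


def route(ctry, doc_id):
    """Return the shard owning the chunk that contains (ctry, doc_id)."""
    return CHUNK_TO_SHARD[bisect_right(SPLIT_POINTS, (ctry, doc_id))]


# Inserts arrive interleaved across countries, so consecutive writes
# land on different shards even though each country's sequence grows.
inserts = [("DE", "DE_1"), ("IN", "IN_1"), ("US", "US_1"),
           ("IN", "IN_2"), ("DE", "DE_2")]
targets = [route(ctry, doc_id) for ctry, doc_id in inserts]
print(targets)
```

In this toy layout every IN write still goes to the same shard, which matches the caveat above: writes are spread across countries, not within one country.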
{ctry: 1, _id: hashed} - Promises both read locality and write distribution. Is this supported by MongoDB?
I like this best. It supports your queries, provides read isolation, and spreads out writes. Before MongoDB 4.4, hashed shard keys had to be built on a single field (from 4.4 on, a compound shard key may include one hashed field), but you can compute your own hash hashed_id, store it on the document, and then shard on { "ctry" : 1, "hashed_id" : 1 }.
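As a rough illustration of computing your own hash, the Python sketch below derives hashed_id from _id before the document is inserted. The with_hashed_id helper and the choice of MD5 truncated to 64 bits are assumptions made for the example, not what MongoDB's built-in hashed indexes are guaranteed to produce; any stable, well-distributed hash would serve.

```python
import hashlib


def with_hashed_id(doc):
    """Return a copy of doc with a hashed_id field derived from its _id."""
    digest = hashlib.md5(doc["_id"].encode("utf-8")).digest()
    # Keep the first 8 bytes as a signed 64-bit integer so the value
    # fits in a BSON 64-bit integer field.
    hashed = int.from_bytes(digest[:8], byteorder="big", signed=True)
    return {**doc, "hashed_id": hashed}


# The application would insert this document instead of the original one.
doc = with_hashed_id({"_id": "IN_1", "ctry": "IN"})
print(sorted(doc))
```

Because the hash is deterministic and the country is the prefix of _id, a lookup by _id can recompute both ctry and hashed_id and include them in the query, so it can still be routed by the shard key.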
just {_id: hashed} - will it make queries starting with ctry slow?
The monotonicity problem is gone, but ctry queries will still be scatter-gather.
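The targeted versus scatter-gather behaviour for ctry queries across the four options can be summarised with a small Python sketch. The is_targeted helper is hypothetical and deliberately simplified: it treats a query as targeted when it constrains the leading field of the shard key, and it ignores details such as equality versus range matches on hashed fields.

```python
def is_targeted(shard_key, query_fields):
    """True if the query constrains the leading shard key field."""
    return shard_key[0] in query_fields


# The four candidate shard keys, as ordered field lists. Option 3 uses
# the application-computed hashed_id field described above.
options = {
    "1: _id": ["_id"],
    "2: ctry, _id": ["ctry", "_id"],
    "3: ctry, hashed_id": ["ctry", "hashed_id"],
    "4: hashed _id": ["_id"],
}

# How a query of the shape { "ctry" : ... } would be routed under each key.
ctry_routing = {
    name: "targeted" if is_targeted(key, {"ctry"}) else "scatter-gather"
    for name, key in options.items()
}
for name, routing in ctry_routing.items():
    print(name, "->", routing)
```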
Upvotes: 1