Reputation: 101
I want to test MongoDB's insert speed. I have 4 shards, 3 config servers, and 4 mongos routers, with a chunk size of 64 MB. When I insert 100 documents each containing a double[100000] array, the data is auto-sharded, but the insert speed does not improve.
(1) I created a database, created a collection "docs", and inserted 100 double[100000] documents; it took 30 s.
(2) I dropped "docs", created a new collection "docs", enabled sharding on it with a hashed shard key ({ name: "hashed" }), and inserted the same data; it took 30 s or more.
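For reference, the setup described above corresponds to something like the following mongo shell commands (database and collection names taken from the sh.status() output below):

```javascript
// enable sharding for the database and shard the collection on a hashed key
sh.enableSharding("liu")
sh.shardCollection("liu.docs", { name: "hashed" })
sh.stopBalancer()   // optional: stop balancer migrations during the test
```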
Each shard holds roughly the same amount of data and the same number of chunks. I have tried chunk sizes of 5 MB, 20 MB, 100 MB, and 200 MB, but I could not reduce the insert time by anything close to 3/4.
Sharding reduces the number of operations each shard handles, so how can I reduce the insert time and improve insert speed by adding shards? Or is my test data at fault, too small to show MongoDB's performance? I stopped the balancer with sh.stopBalancer() and checked sh.status():
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"version" : 4,
"minCompatibleVersion" : 4,
"currentVersion" : 5,
"clusterId" : ObjectId("5450ed56eb3978383f81a863")
}
shards:
{ "_id" : "s1", "host" : "192.168.137.101:27017" }
{ "_id" : "s2", "host" : "192.168.137.102:27018" }
{ "_id" : "s3", "host" : "192.168.137.103:27019" }
{ "_id" : "s4", "host" : "192.168.137.104:27020" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "liu", "partitioned" : true, "primary" : "s2" }
liu.docs
shard key: { "name" : "hashed" }
chunks:
s1 4
s2 7
s3 6
s4 5
too many chunks to print, use verbose if you want to force print
{ "_id" : "test", "partitioned" : false, "primary" : "s1" }
Every shard has some of the data, which suggests MongoDB has distributed it evenly via the shard key. So why has the insert speed not improved? Is something wrong? Have you run into the same situation, or managed to reduce the insert time?
Update: I managed to reduce the insert time by using multiple client threads.
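The multithreaded approach can be sketched as follows: split the documents into batches and insert the batches concurrently so several round-trips are in flight at once. The batching helper below is shown in full; the driver calls in the comment use the official "mongodb" npm package API, with the mongos address and batch count being assumptions:

```javascript
// Split an array of documents into numBatches batches of roughly equal size.
function splitIntoBatches(docs, numBatches) {
  const size = Math.ceil(docs.length / numBatches);
  const batches = [];
  for (let i = 0; i < docs.length; i += size) {
    batches.push(docs.slice(i, i + size));
  }
  return batches;
}

// With a live mongos (address assumed), the batches can then be
// inserted in parallel with the official Node.js driver:
//
//   const { MongoClient } = require("mongodb");
//   const client = new MongoClient("mongodb://192.168.137.101:30000");
//   await client.connect();
//   const coll = client.db("liu").collection("docs");
//   await Promise.all(
//     splitIntoBatches(docs, 8).map(b => coll.insertMany(b, { ordered: false }))
//   );
```

Unordered inserts (ordered: false) additionally let the server process documents in the batch in parallel across shards.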
Upvotes: 0
Views: 1901
Reputation: 4212
You cannot "dramatically" improve insert speed via sharding. There are too many decisions to be made during each insert if you want to distribute the insert operations almost equally across the replica sets. In fact, with sharding you have more operations to deal with per insert than when inserting into a single instance.
If you want real speed and you can afford to risk some durability, your best bet is turning off write acknowledgements and using fire-and-forget inserts.
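A minimal mongo shell sketch of such an unacknowledged insert (the collection name is taken from the question; the document contents are placeholders):

```javascript
// write concern { w: 0 }: the client does not wait for the server to
// acknowledge the write -- faster, but errors go unreported
db.docs.insert({ name: "a", data: values }, { writeConcern: { w: 0 } })
```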
Upvotes: 0
Reputation: 1935
There are 2 possible scenarios here:
1. Inserts are spread evenly across all the shards. In this scenario, read as well as write performance will improve roughly linearly with every shard added. The number of mongos routers can also be increased.
2. Inserts are focused on only one shard or a subset of shards. In this scenario, adding shards will not increase performance. This usually indicates that the shard key has low cardinality or too little randomness. Check out this link: Choosing a Shard key
Since you have not given us sufficient data (which shard key is used, and which shards the inserts actually hit), you need to work out which of the two scenarios above is preventing improvements in write performance.
Hope this helps.
Upvotes: 1