Reputation: 494
I'm building an architecture using MongoDB.
I have read that it is possible (and considered a best practice) to send read requests for statistics to secondary servers, and that this should give better performance.
I have done the MongoDB certifications (Node.js and DBA) and read http://docs.mongodb.org/manual/core/read-preference/, but I am still wondering what performance gain we can expect.
In fact, I don't really understand how a secondary server, which replays every write the primary has already received (via the oplog), could be more efficient. The number of disk writes is the same. So even if clients only read from this server, it still writes the same amount of data as the primary.
Can anybody explain how MongoDB manages (if it really does) to deliver better read performance on secondary servers?
Thanks for your help.
Upvotes: 2
Views: 1815
Reputation: 3646
Yes, definitely: if your writes are roughly equal to or higher than your reads, reading from secondaries might not benefit you.
The MongoDB documentation presents some use cases: https://docs.mongodb.com/manual/core/read-preference-use-cases/
This article on cases where secondaries can be used for reads might also be helpful:
https://medium.com/@arun2pratap/mongodb-read-from-secondary-to-boost-performance-dca938a680ac
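For reference, routing reads to secondaries is usually just a connection-string setting. Below is a minimal sketch that builds such a URI using the standard `replicaSet`, `readPreference` and `maxStalenessSeconds` options. The host names, the replica set name `rs0` and the helper `analytics_uri` are made up for illustration; they are not from the question.

```python
from urllib.parse import urlencode


def analytics_uri(hosts, replica_set, max_staleness_seconds=None):
    """Build a MongoDB URI that prefers secondaries for reads.

    Hypothetical helper: hosts and replica_set are placeholders
    that you would replace with your own deployment's values.
    """
    options = {
        "replicaSet": replica_set,
        # secondaryPreferred falls back to the primary when no
        # secondary is available, unlike plain "secondary".
        "readPreference": "secondaryPreferred",
    }
    if max_staleness_seconds is not None:
        # MongoDB rejects staleness bounds below 90 seconds.
        if max_staleness_seconds < 90:
            raise ValueError("maxStalenessSeconds must be at least 90")
        options["maxStalenessSeconds"] = max_staleness_seconds
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


uri = analytics_uri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    "rs0",
    max_staleness_seconds=120,
)
print(uri)
```

A statistics job would connect with this URI while the main application keeps its default (primary) read preference, so only the analytics traffic is moved.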
Upvotes: 0
Reputation: 494
Thanks to @wdberkeley's response on hidden replicas, I found another link on delayed members in a replica set.
Since most statistics use cases don't need up-to-date information, we can imagine a server that stops reading the oplog for part of the day.
For example, we can keep 30 hours of oplog and configure a 24-hour delay on a replica so that it consumes the oplog only at night.
Then, during the day, it performs no disk writes, which should give better performance for larger read requests for statistics purposes.
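As a rough sketch of the numbers above, here is how the member document for such a delayed, hidden replica could be built and sanity-checked. The helper `delayed_member` and the host name are hypothetical. The delay field is `secondaryDelaySecs` in MongoDB 5.0 and later (`slaveDelay` in earlier versions), and a delayed member has to have priority 0. The sketch only checks that the oplog window is longer than the delay; it does not show that the member is idle during the day, which remains an assumption of this idea.

```python
def delayed_member(member_id, host, delay_hours, oplog_window_hours):
    """Return a replica set member document for a delayed, hidden member.

    Hypothetical helper: it only builds the dictionary that would go
    into the "members" array of a replica set configuration.
    """
    # The delay must fit inside the oplog window, otherwise the
    # member cannot catch up from the oplog any more.
    if oplog_window_hours <= delay_hours:
        raise ValueError("oplog window must be longer than the delay")
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,   # a delayed member must never become primary
        "hidden": True,  # keep it out of normal application reads
        "secondaryDelaySecs": delay_hours * 3600,  # "slaveDelay" before 5.0
    }


member = delayed_member(3, "stats.example.net:27017",
                        delay_hours=24, oplog_window_hours=30)
print(member["secondaryDelaySecs"])  # 86400
```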
Upvotes: 2
Reputation: 2064
Pierre, the whole notion of replica sets is about failover; it is not about spreading your load for better performance. Although you can do your reads against secondaries, as common wisdom seems to suggest, you have to consider what will happen during a server failure, when you no longer have the luxury of separate servers and all of your writes and reads go to the same server. Will you discover at that point that your primary server is underprovisioned?
You are correct in assuming that a secondary does the same amount of write work as the primary, so going to a secondary for reads is not more efficient in itself. But if you have multiple secondary servers, you can spread your reads across them, so each server responds to fewer reads.
However, my original statement still stands: will your system be able to handle the load if one of the servers fails?
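To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes (unrealistically) that a read and a write cost the same, that every member applies every write, and that reads are spread evenly across secondaries. The function name and the figures are invented for illustration.

```python
def load_per_member(writes, reads, members, read_from_secondaries=True):
    """Return operations per second for each member, primary first.

    Simplified model: every member applies every write; reads go either
    to the primary or are split evenly across the secondaries.
    """
    if members < 1:
        raise ValueError("need at least one member")
    secondaries = members - 1
    if secondaries == 0 or not read_from_secondaries:
        # All reads land on the primary.
        return [writes + reads] + [writes] * secondaries
    reads_each = reads / secondaries
    return [writes] + [writes + reads_each] * secondaries


# Healthy three-member set: the reads are shared by two secondaries.
print(load_per_member(1000, 3000, members=3))  # [1000, 2500.0, 2500.0]
# One member down: the single remaining secondary takes every read.
print(load_per_member(1000, 3000, members=2))  # [1000, 4000.0]
# Two members down: one server carries all writes and all reads.
print(load_per_member(1000, 3000, members=1))  # [4000]
```

In this toy model, a secondary sized for 2500 operations per second is fine while the set is healthy but has to absorb 4000 once a member is lost, which is the provisioning question raised above.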
Upvotes: 4
Reputation: 20703
It doesn't improve the performance of the query itself, but your analytics would have little to no impact on the primary if you read from a secondary, which reduces their overall impact on the application. That is what I would call efficient.
Upvotes: 1