Pierre Le Roux

Reputation: 494

Read from MongoDB secondary server / performance

I'm building an architecture using MongoDB.

I saw that it is possible (and presented as a best practice) to send read requests for statistics purposes to secondary servers, with better performance as the expected consequence.

I have already taken the MongoDB certifications (Node JS and DBA) and read http://docs.mongodb.org/manual/core/read-preference/, and I was wondering about the performance gap we can expect.

Actually, I don't understand very well how a secondary server, which replays all the writes the primary already received (via the oplog), could be more efficient. The number of writes to disk is the same, so even though this server only serves reads, it still writes the same amount of data.

Can anybody explain how MongoDB manages (if it really does) to deliver better read performance on secondary servers?
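For reference, what I mean by "sending reads to secondaries" is just a read preference on the connection. A minimal sketch with the Node.js driver, where the hostnames, replica set name, and database/collection names are placeholders:

    // Minimal sketch: route statistics reads to secondaries via a read preference.
    // Hostnames, replica set name, database and collection names are placeholders.
    const { MongoClient } = require('mongodb');

    async function runStats() {
      // secondaryPreferred: read from a secondary when one is available,
      // fall back to the primary otherwise.
      const uri = 'mongodb://host1:27017,host2:27017,host3:27017/?replicaSet=rs0&readPreference=secondaryPreferred';
      const client = await MongoClient.connect(uri);

      const perType = await client.db('app')
        .collection('events')
        .aggregate([{ $group: { _id: '$type', count: { $sum: 1 } } }])
        .toArray();

      console.log(perType);
      await client.close();
    }

    runStats().catch(console.error);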

Thanks for your help.

Upvotes: 2

Views: 1815

Answers (4)

Arun Pratap Singh

Reputation: 3646

Yes, definitely: if writes are roughly equal to or higher than reads, reading from secondaries might not benefit you.

Some use cases are presented in the MongoDB documentation: https://docs.mongodb.com/manual/core/read-preference-use-cases/

Use cases where we can use secondaries for reads (see the sketch after this list):

  • Reporting/analytics where stale data is acceptable.
  • Writes are significantly low compared to reads (you still have to account for stale data).
  • Providing local reads for geographically distributed applications.
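For the reporting and geo-distributed cases above, a hedged sketch of the corresponding connection options (hostnames and replica set name are placeholders; maxStalenessSeconds and nearest are standard connection-string / read-preference options):

    // Sketch of two read-preference choices matching the use cases above.
    // Hostnames and the replica set name are placeholders.
    const { MongoClient } = require('mongodb');

    // Reporting/analytics: stale data is acceptable, but bound the lag
    // (maxStalenessSeconds must be at least 90).
    const analyticsClient = new MongoClient(
      'mongodb://host1,host2,host3/?replicaSet=rs0' +
      '&readPreference=secondaryPreferred&maxStalenessSeconds=120'
    );

    // Geographically distributed application: read from the lowest-latency member.
    const localReadsClient = new MongoClient(
      'mongodb://host1,host2,host3/?replicaSet=rs0&readPreference=nearest'
    );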

https://medium.com/@arun2pratap/mongodb-read-from-secondary-to-boost-performance-dca938a680ac

This might be helpful.

Upvotes: 0

Pierre Le Roux

Reputation: 494

Thanks to @wdberkeley's response on hidden replicas, I found another link about delayed members in a replica set.

Since most statistics use cases don't need up-to-date information, we can imagine a server that stops reading the oplog for a while.

For example, we can keep the oplog for 30 hours and set a delay of 24 hours on a replica so that it consumes the oplog only during the night.

Then, during the day, it doesn't perform any write operations on disk, which should allow better performance for bigger read requests for statistics purposes.
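A minimal sketch of the delayed, hidden member configuration in the mongo shell looks like this (the member index is a placeholder; the field is called slaveDelay on older versions and secondaryDelaySecs from MongoDB 5.0 on):

    // Sketch (mongo shell): make the third member a hidden, delayed secondary.
    cfg = rs.conf()
    cfg.members[2].priority = 0              // a delayed member must not become primary
    cfg.members[2].hidden = true             // keep normal application reads away from it
    cfg.members[2].slaveDelay = 24 * 60 * 60 // 24 hours behind the primary
    rs.reconfig(cfg)

Note that on the MongoDB versions this question targets, the oplog window is governed by the oplog size rather than by a time setting, so the 30 hours above has to be estimated from the write volume.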

Upvotes: 2

alernerdev

Reputation: 2064

Pierre, the whole notion of replica sets is about failover; it is not about spreading your load for better performance. Although you can do your reads against secondaries, as common wisdom seems to suggest, you have to consider what will happen during a server failure, when you no longer have the luxury of separate servers and all of your writes and reads go to the same server: will you discover at that point that your primary server is under-provisioned?

You are correct in assuming that the secondary does the same amount of work as the primary, so going to a secondary for reads is not more efficient in itself. But if you have multiple secondary servers, you can spread your reads across them, so each server responds to fewer reads. However, my original point still stands: will your system be able to handle the load if one of the servers fails?

Upvotes: 4

Markus W Mahlberg

Reputation: 20703

It doesn't improve the performance of the query itself, but your analytics would have little to no impact on the primary if you read from the secondary, thus reducing the overall impact on the application, which is what I would call efficient.

Upvotes: 1
