Reputation: 987
EDIT: That was a red herring - see answer below.
We have a set of databases on MongoDB Atlas (M10, M20 clusters), and I noticed that while planning and executing queries is super fast, actually returning bigger sets of documents takes ages.
As an example, the query below fetches 30K IDs from a collection (which contains just 100K documents). This takes 15 seconds to return that data from an M20 cluster (either to a co-located Node app or to my local machine):
db.mycollection.find({}, {_id: 1}).limit(30000)
In comparison: my toy PostgreSQL instance that costs a fraction of the M20 cluster (also located in the same AWS region) needs 0.5 seconds to return 100K full rows from a table with 10M rows.
I understand that MongoDB has some overhead due to JSON, but the performance difference is so huge that I can't help but wonder whether this is really just the performance I can expect from MongoDB, or whether something is severely off with those clusters?
Upvotes: 2
Views: 676
Reputation: 987
Turns out this was a huge red herring :/ The shell limits the results behind the scenes, as it shows the data in pages. So a completely different reason led to the same delay, which sent me on a wild goose chase :)

Silver lining: not relying on default batch sizes made a huge difference here. Explicitly setting the batch size to 5K or 10K got me an easy 30% performance boost, also when triggering the queries through the Node client library.
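For anyone wanting to try the same, here is a minimal sketch of setting the batch size explicitly with the official Node.js driver (the URI and the "mydb"/"mycollection" names are placeholders, and 10000 is just the value that worked for me):

// Minimal sketch: fetch 30K IDs with an explicit cursor batch size.
const { MongoClient } = require('mongodb');

async function fetchIds(uri) {
  const client = new MongoClient(uri);
  try {
    await client.connect();
    const ids = await client
      .db('mydb')
      .collection('mycollection')
      .find({}, { projection: { _id: 1 } })
      .limit(30000)
      .batchSize(10000) // larger batches mean fewer getMore round trips than the default
      .toArray();
    return ids;
  } finally {
    await client.close();
  }
}

The shell equivalent is the same chained call, i.e. db.mycollection.find({}, {_id: 1}).limit(30000).batchSize(10000).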
Upvotes: 0