Ilja

Reputation: 46499

How to find the optimal connection pool size for a single mongo nodejs driver

I am using official mongo nodejs driver with default settings, but was digging deeper into options today and apparently there is an option of maxPoolSize that is set to 100 by default.
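For reference, this is roughly where that option is set (the URI here is just a placeholder):

    const { MongoClient } = require('mongodb');

    // 100 is the driver's documented default.
    const client = new MongoClient('mongodb://localhost:27017', {
      maxPoolSize: 100, // upper bound on open connections for this client
    });

    // Equivalent connection-string form:
    // mongodb://localhost:27017/?maxPoolSize=100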

My understanding of this is that a single nodejs process can establish up to 100 connections, thus allowing mongo to handle up to 100 reads/writes simultaneously, in parallel?

If so, it seems that setting this number higher could only benefit performance, but I am not sure, hence I decided to ask here.

Assuming a default setup with no indexes, is there a way to determine (based on the CPUs and memory of the db) what the optimal size for the connection pool should be?

We can also assume that the nodejs process itself is not a bottleneck (i.e. it can be scaled horizontally).

Upvotes: 3

Views: 444

Answers (1)

Alex Blex

Reputation: 37048

Good question =)

it seems that setting this number higher could only benefit performance

It does indeed seem so, and it would be the case for an abstract nodejs process in a vacuum with unlimited resources. In reality connections are not free, so there are things to consider:

  • limited connection quota on the server. Atlas in particular, but even a self-hosted cluster has only ~65k sockets. Remember that the driver keeps connections open for reuse, and the default timeout per cursor is 30 minutes of inactivity.
  • a single thread clientside. BSON serialisation blocks the event loop and is quite expensive, e.g. see the flame chart in this answer: https://stackoverflow.com/a/72264469/1110423. By blocking the loop you increase the time the cursors from the previous point remain open, and in the worst case you get performance degradation; see the sketch after this list.
  • limited RAM. Each connection requires ~1 MB serverside.
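To make the second point concrete, here is a minimal sketch that measures how late a timer fires while a large result set is deserialised. The URI, database, and collection names are placeholders:

    const { MongoClient } = require('mongodb');

    async function main() {
      const client = new MongoClient('mongodb://localhost:27017');
      await client.connect();

      // Probe: a 50 ms timer that reports how late it actually fires.
      let last = Date.now();
      const probe = setInterval(() => {
        const now = Date.now();
        const lag = now - last - 50;
        if (lag > 10) console.log(`event loop blocked for ~${lag} ms`);
        last = now;
      }, 50);

      // Heavy read: BSON deserialisation happens on the single main thread,
      // so the probe above reports lag spikes while this resolves.
      const docs = await client.db('test').collection('bigdocs').find().toArray();
      console.log(`fetched ${docs.length} documents`);

      clearInterval(probe);
      await client.close();
    }

    main().catch(console.error);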

Assuming a default setup with no indexes

You have at least the _id index, and you should have more if we are talking about performance.
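For illustration, an extra index is a single call; the database, collection, and field names here are made up:

    const { MongoClient } = require('mongodb');

    async function addIndex() {
      const client = new MongoClient('mongodb://localhost:27017');
      await client.connect();
      // Queries filtering on 'userId' would otherwise scan the whole collection.
      await client.db('test').collection('events').createIndex({ userId: 1 });
      await client.close();
    }

    addIndex().catch(console.error);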

is there a way to determine what the optimal size for the connection pool should be?

I'd love to know that too. There are too many factors to consider, not only CPU/RAM, but also data shape, query patterns, etc. This is what db ops teams are for. A Mongo cluster requires some attention, monitoring, and adjustment for optimal operation. In many cases it's more cost-efficient to scale up the cluster than to optimise the app.
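One starting point for that monitoring is the server's own connection counters; a minimal sketch, assuming you can run serverStatus against the admin database:

    const { MongoClient } = require('mongodb');

    async function connectionHeadroom(uri) {
      const client = new MongoClient(uri);
      await client.connect();
      try {
        const status = await client.db('admin').command({ serverStatus: 1 });
        const { current, available } = status.connections;
        console.log(`connections in use: ${current}, still available: ${available}`);
        // Rough rule of thumb from above: each open connection costs ~1 MB of server RAM.
      } finally {
        await client.close();
      }
    }

    connectionHeadroom('mongodb://localhost:27017').catch(console.error);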

We can also assume that the nodejs process itself is not a bottleneck (i.e. it can be scaled horizontally).

This is quite a wild assumption. The process itself cannot scale horizontally. Connections live at the OS level: once a socket is opened by a process, it is bound to that process until it is closed. You can use the node cluster module to utilise all CPU cores, and can even have multiple servers running the same nodejs app behind a load balancer, but none of them will share connections from the pool. The pool is local to the nodejs process.
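A minimal sketch of that last point, using Node's built-in cluster module (the URI is a placeholder): each forked worker is a separate process with its own client and its own pool.

    const cluster = require('node:cluster');
    const os = require('node:os');
    const { MongoClient } = require('mongodb');

    if (cluster.isPrimary) {
      // Fork one worker per CPU core.
      for (let i = 0; i < os.cpus().length; i++) cluster.fork();
    } else {
      // Each worker builds its own client; nothing here is shared between workers.
      const client = new MongoClient('mongodb://localhost:27017', { maxPoolSize: 10 });
      client.connect().then(() => {
        console.log(`worker ${process.pid} opened its own pool`);
      });
    }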

Upvotes: 2
