Rostyslav Malenko

Reputation: 569

Aerospike: Unbalanced number of connections on nodes

Our cluster had two nodes, and we then added two more. The cluster status is OK, but the client connection counts are unbalanced across nodes. The two old nodes (1 and 2) have similar connection counts to each other, and so do the two new nodes (3 and 4), but the old pair has far more connections than the new pair.

The hardware and configurations are identical on these four nodes.

Could someone tell me whether I missed something when adding the new nodes, and how to fix it?

Thanks

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Network Information (2020-04-23 15:28:01 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Node   Node                   Ip       Build   Cluster   Migrations        Cluster     Cluster   Principal   Client       Uptime   
     .     Id                    .           .      Size            .            Key   Integrity           .    Conns            .   
ny-as-1:3000   2C     192.168.11.17:3000   C-4.7.0.2         4      0.000     E97140B6302A   True        EE            2637   3219:11:28   
ny-as-2:3000   *EE    192.168.11.18:3000   C-4.7.0.2         4      0.000     E97140B6302A   True        EE            2525   3219:19:13   
ny-as-3:3000   2E     192.168.11.19:3000   C-4.7.0.2         4      0.000     E97140B6302A   True        EE             356   195:46:21    
ny-as-4:3000   3E     192.168.11.20:3000   C-4.7.0.2         4      0.000     E97140B6302A   True        EE             371   195:39:53    
Number of rows: 4

Data size / total record spread:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Namespace Usage Information (2020-04-23 15:28:01 UTC)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Namespace           Node       Total   Expirations,Evictions     Stop         Disk    Disk     HWM   Avail%         Mem     Mem    HWM      Stop          PI         PI      PI     PI   
        .              .     Records                       .   Writes         Used   Used%   Disk%        .        Used   Used%   Mem%   Writes%        Type       Used   Used%   HWM%   
dsp         ny-as-1:3000   167.326 M   (99.234 B, 0.000)       false     54.458 GB   7       85      89        9.973 GB   18      85     90        undefined   9.973 GB   0       N/E    
dsp         ny-as-2:3000   160.925 M   (95.320 B, 0.000)       false     52.376 GB   7       85      90        9.592 GB   18      85     90        undefined   9.592 GB   0       N/E    
dsp         ny-as-3:3000   156.493 M   (10.839 B, 0.000)       false     50.918 GB   7       85      90        9.328 GB   17      85     90        undefined   9.328 GB   0       N/E    
dsp         ny-as-4:3000   158.321 M   (10.985 B, 0.000)       false     51.503 GB   7       85      90        9.437 GB   17      85     90        undefined   9.437 GB   0       N/E    
dsp                               643.066 M   (216.379 B, 0.000)               209.254 GB                            38.330 GB                                        0.000 B                   
Number of rows: 5

Upvotes: 1

Views: 534

Answers (1)

Meher

Reputation: 2939

This is not necessarily an issue. Any time a workload slows down or bursts, more connections are created. If the workload keeps using those connections frequently enough after the temporary burst or slowdown, they simply stay active, and that is no cause for concern. Here is a simple example to illustrate this:

Imagine your transactions complete within 10 ms and your initial workload is 100 transactions per second (tps). You can sustain this with a single connection (100 round trips of 10 ms each within 1 second). Now, if the server, the client, or the network in between slows down so that transactions take 100 ms instead of 10 ms, you will need 10 connections to sustain 100 tps, so you will open 9 new connections. Each connection would then do 10 tps, for a total of 100 tps across all 10.
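
To make the arithmetic concrete, here is a rough sketch in Python. It assumes each connection handles one transaction at a time, so the number of connections needed is just throughput times latency. The function name `connections_needed` is made up for illustration and is not part of any Aerospike client API.

```python
import math

def connections_needed(tps: float, latency_ms: float) -> int:
    """Minimum connections to sustain `tps` when each transaction
    occupies one connection for `latency_ms` (assumed model)."""
    # Busy connection-seconds required per second of wall-clock time.
    busy = tps * latency_ms / 1000.0
    # At least one connection is always needed.
    return max(1, math.ceil(busy))

print(connections_needed(100, 10))   # normal latency
print(connections_needed(100, 100))  # during the slowdown
```

With the numbers from the example, this gives 1 connection at 10 ms and 10 connections at 100 ms.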

When the latency goes back down to 10 ms per transaction, you end up using each of the 10 connections 10 times per second instead of using a single connection 100 times per second. As long as each connection is used at least once every 55 seconds (the default idle time on the client), those 10 connections remain active and in use. A new node that joins while transactions are completing within 10 ms would need just 1 connection, whereas the older node would still be using 10.
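
A quick way to check whether the extra connections would ever be dropped, sketched in Python. It assumes the client spreads transactions evenly over the pool (a simplification, real clients may pick connections differently) and uses the 55-second idle time mentioned above. The name `pool_stays_open` is hypothetical.

```python
def pool_stays_open(pool_size: int, tps: float,
                    idle_timeout_s: float = 55.0) -> bool:
    """True if every connection is reused before the idle timeout,
    assuming transactions are spread evenly over the pool."""
    # Average gap between two uses of the same connection.
    seconds_between_uses = pool_size / tps
    return seconds_between_uses < idle_timeout_s

# 10 connections at 100 tps: each is used every 0.1 s, so none go idle.
print(pool_stays_open(10, 100))
# 10 connections at 0.1 tps: each is used every 100 s, so they would time out.
print(pool_stays_open(10, 0.1))
```

Under this model, the older nodes only shed their extra connections if the per-connection traffic drops low enough for the gap between uses to exceed the idle time.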

You can think of this as some sort of hysteresis if that makes sense...

Hope this helps!

Upvotes: 2
