dotsinspace
dotsinspace

Reputation: 691

How to really decide Patroni Configuration based on System Configuration

I am trying to build Dedicated Patroni Cluster Which can basically handles my traffic. so currently dedicated instance is about 124GB Ram, 31vCPU, 1 TB SSD now with that said. i did little bit google didn't got that much help on what should be the correct configuration. so i asked Chat GPT to help me out in this and it came up with the following solution.

loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    checkpoint_completion_target: 0.9
    checkpoint_timeout: 30min
    citus.node_conninfo: sslrootcert=/etc/ssl/certs/ssl-cert-snakeoil.pem sslkey=/etc/ssl/private/ssl-cert-snakeoil.key sslcert=/etc/ssl/certs/ssl-cert-snakeoil.pem sslmode=verify-ca
    effective_cache_size: 96GB
    effective_io_concurrency: 200
    maintenance_work_mem: 2GB
    max_connections: 3000
    max_parallel_workers_per_gather: 8
    random_page_cost: 1.1
    shared_buffers: 32GB
    ssl: 'on'
    ssl_ca_file: /etc/ssl/certs/ssl-cert-snakeoil.pem
    ssl_cert_file: /etc/ssl/certs/ssl-cert-snakeoil.pem
    ssl_key_file: /etc/ssl/private/ssl-cert-snakeoil.key
    synchronous_commit: false
    wal_buffers: 32MB
    wal_level: replica
    work_mem: 64MB
  pg_hba:
  - local all all trust
  - hostssl replication replicator all md5 clientcert=verify-ca
  - hostssl all all all md5 clientcert=verify-ca
  use_pg_rewind: true
retry_timeout: 10
synchronous_mode: quorum
ttl: 30
[![enter image description here](https://i.sstatic.net/oeeKc4A4.png)](https://i.sstatic.net/oeeKc4A4.png)

Now i didn't know much about it on what is a correct approch to decide the configuration. so i used it. Now my server is getting Hits around 2.8Lac Per day with that much traffic i saw db then it says following information (Kindly See Image).

Now Issue is this that my server starts complaining about following Prisma error.

1|index  | PrismaClientKnownRequestError:
1|index  | Invalid `prisma.$queryRawUnsafe()` invocation:
1|index  | Raw query failed. Code: `53100`. Message: `ERROR: could not resize shared memory segment "/PostgreSQL.2217184920" to 610912 bytes: No space left on device`
1|index  |     at _n.handleRequestError (/home/smsflash/smsflash__node/.build/node_modules/.pnpm/@[email protected][email protected]/node_modules/@prisma/client/runtime/library.js:121:7749)
1|index  |     at _n.handleAndLogRequestError (/home/smsflash/smsflash__node/.build/node_modules/.pnpm/@[email protected][email protected]/node_modules/@prisma/client/runtime/library.js:121:7057)
1|index  |     at _n.request (/home/smsflash/smsflash__node/.build/node_modules/.pnpm/@[email protected][email protected]/node_modules/@prisma/client/runtime/library.js:121:6741)
1|index  |     at async l (/home/smsflash/smsflash__node/.build/node_modules/.pnpm/@[email protected][email protected]/node_modules/@prisma/client/runtime/library.js:130:9355) {
1|index  |   code: 'P2010',
1|index  |   clientVersion: '5.18.0',
1|index  |   meta: {
1|index  |     code: '53100',
1|index  |     message: 'ERROR: could not resize shared memory segment "/PostgreSQL.2217184920" to 610912 bytes: No space left on device'
1|index  |   }
1|index  | }

Which is basically about Shared Memory getting exaushted. Now I am confused How much this server can handle the traffic. There is no other instance etc it just a Pure Patroni cluster.

I am looking for the Correct Approch to decide the correct configuration for such systems.

Upvotes: 0

Views: 28

Answers (1)

Laurenz Albe
Laurenz Albe

Reputation: 245813

Reduce max_connections to 100 and use a connection pool.

Reduce shared_buffers to 8GB.

If the problem persists, try setting max_parallel_workers_per_gather to 0 to disable parallel query. There seems to be a bug in PostgreSQL that sometimes makes parallel query consume excessive amounts of memory.

Upvotes: 0

Related Questions