Reputation: 631
We are using Postgres as the data source for a Grafana dashboard. While the query is running, the top command shows postgres using 100% CPU, but the overall CPU usage is only about 6%. This results in slow query responses, and Grafana then shows a 524 timeout error (Cloudflare) (see the screenshots below).
System configuration: OS: Ubuntu 16.04, RAM: 16 GB, CPU: 16 cores, Hyper-V
Below is the configuration file:
postgresql.conf
max_connections = 300
unix_socket_directories = '/var/run/postgresql'
ssl = true
shared_buffers = 4GB
work_mem = 13981kB
maintenance_work_mem = 1GB
dynamic_shared_memory_type = posix
effective_io_concurrency = 200
max_worker_processes = 16
wal_buffers = 16MB
max_wal_size = 8GB
min_wal_size = 2GB
checkpoint_completion_target = 0.9
random_page_cost = 1.1
effective_cache_size = 12GB
log_line_prefix = '%t [%p-%l] %q%u@%d '
log_timezone = 'localtime'
stats_temp_directory = '/var/run/postgresql/9.5-main.pg_stat_tmp'
datestyle = 'iso, mdy'
timezone = 'localtime'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
I am a newbie with Postgres, so please let me know if I have missed something or if you have any suggestions.
Update: My /boot folder is 100% full; I am not sure whether that makes any difference.
Upvotes: 0
Views: 1547
Reputation: 247215
The summary line in the top output you show has the cumulative CPU usage.
One of your cores is busy with a PostgreSQL query, but that is only one of the several cores in the machine, so that is included in the 6.6% "user" CPU shown in the summary line.
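To make the arithmetic concrete, here is a minimal sketch of how one fully busy core translates into the summary figure. It assumes the 16 cores from the question and that top reports per-process CPU relative to a single core (its default mode); the function name summary_share is made up for illustration.

```python
def summary_share(busy_cores: float, total_cores: int) -> float:
    """Return the percentage of the whole machine's CPU used
    when `busy_cores` cores are fully busy."""
    return 100.0 * busy_cores / total_cores


# One backend saturating one core on a 16-core machine (assumed from the question)
print(summary_share(1, 16))  # 6.25
```

So a process shown at 100% accounts for roughly 6% of the machine, which is in line with the "user" figure in the summary line.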
The alarming part of that output is the 74.3% "system" CPU time: three quarters of the cores in your machine are busy with operating system work. Something is seriously wrong. Perhaps you didn't disable transparent huge pages? To reach a conclusion, though, you need a deeper analysis by someone who understands Linux.
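If you want to check the transparent huge pages guess yourself, a rough sketch follows. It assumes the usual Linux sysfs location /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is the one in square brackets (for example "always madvise [never]"); the helper names are hypothetical, and the path may differ or be absent on your kernel.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed location of the THP setting on a typical Linux kernel
THP_PATH = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text: str) -> Optional[str]:
    """Extract the bracketed (active) mode from the sysfs file contents."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def read_thp_mode(path: Path = THP_PATH) -> Optional[str]:
    """Read the active THP mode, or return None if the file is unavailable."""
    try:
        return active_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_thp_mode())
```

A result of "always" would mean transparent huge pages are enabled for all processes. That alone does not prove they are the cause of the high "system" time; it only tells you whether the guess is worth following up.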
Upvotes: 2