The Georgia

Reputation: 1075

Grouping servers in Grafana/Prometheus

I would like to group database servers in Grafana dashboards so that servers belonging to the same cluster, e.g. db-pxc, end up looking like this:

DB-PXC
    -Disk_Performance
        -db-pxc-1
        -db-pxc-2
        -db-pxc-3
        ...
    -Disk_Space    
        -db-pxc-1
        -db-pxc-2
        -db-pxc-3
        ...
    -MySQL_Overview
        -db-pxc-1
        -db-pxc-2
        -db-pxc-3
        ...
    -MySQL_Table_statistics
        -db-pxc-1
        -db-pxc-2
        -db-pxc-3
        ...
     ...

So if I click on the parent dashboard Disk_Space, it displays the disk space sub-dashboard for each host in the db-pxc cluster (db-pxc-1, db-pxc-2, db-pxc-3, ...). That way I can compare the disk space usage of all the servers in one cluster on a single page. We already have this setup in Cacti, but we are not sure how to achieve the same with Grafana.

We are using the Prometheus monitoring system, with node_exporter and mysqld_exporter collecting statistics on each individual server, and Grafana for viewing the dashboards. To view the node_exporter and mysqld_exporter data from Prometheus in Grafana, we are using the Percona Grafana plugin.

Below is an example of what I am asking for. In the picture below, the DB cluster is named kdb, and db-kdb-1, db-kdb-2, db-kdb-3 and db-kdb-4 are among the nodes that form the cluster. As shown, when I click on CPU, it shows the CPU usage of all my kdb cluster nodes.

[Image: CPU usage graphs for the kdb cluster nodes]

Upvotes: 2

Views: 5018

Answers (2)

Djidiouf

Reputation: 840

You need to create a Prometheus target containing the IPs of all the instances in your cluster, and then use its job name in Grafana.

Create the following target file with the IPs of your cluster's instances:

- targets:
  - 10.149.121.21:9100
  - 10.149.121.22:9100
  - 10.149.121.23:9100
  - 10.149.121.24:9100
  labels:
    job: kdbcluster
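
For Prometheus to pick up that target file, it has to be referenced from the main configuration. A minimal sketch of the relevant part of prometheus.yml, assuming the file above is saved as /etc/prometheus/targets/kdbcluster.yml (a hypothetical path, adjust to your layout):

```yaml
# prometheus.yml (fragment)
scrape_configs:
  # Scrape job that reads its targets from the file shown above.
  # The job_name here is the default "job" label; the explicit
  # "job: kdbcluster" label in the target file is expected to take
  # precedence for those targets, so both are kept identical.
  - job_name: kdbcluster
    file_sd_configs:
      - files:
          # Hypothetical location of the target file
          - /etc/prometheus/targets/kdbcluster.yml
```

With file-based service discovery, Prometheus watches the listed files, so adding or removing a node from the cluster should only require editing the target file.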

Then, in Grafana, create 4 new graphs, one per instance, with the following queries:

100 - (avg by (instance) (irate(node_cpu{instance="10.149.121.21:9100",mode="idle", job="kdbcluster"}[5m])) * 100)
100 - (avg by (instance) (irate(node_cpu{instance="10.149.121.22:9100",mode="idle", job="kdbcluster"}[5m])) * 100)
100 - (avg by (instance) (irate(node_cpu{instance="10.149.121.23:9100",mode="idle", job="kdbcluster"}[5m])) * 100)
100 - (avg by (instance) (irate(node_cpu{instance="10.149.121.24:9100",mode="idle", job="kdbcluster"}[5m])) * 100)

If you want all instances on the same graph, you can use this query:

100 - (avg by (instance) (irate(node_cpu{mode="idle", job="kdbcluster"}[5m])) * 100)

If you want to add to the previous graph a line showing the average CPU load across all instances, you can use this query:

100 - (avg (irate(node_cpu{mode="idle", job="kdbcluster"}[5m])) * 100)

Upvotes: 2

brian-brazil

Reputation: 34112

For, say, percentage root filesystem usage, you'd have one graph with an expression like:

100 - node_filesystem_free{job='node',mountpoint='/'} / node_filesystem_size{job='node',mountpoint='/'} * 100

which would show the result for all matching machines.

Upvotes: 3
