Kartik Chauhan

Reputation: 3068

Getting error "Get http://localhost:9443/metrics: dial tcp 127.0.0.1:9443: connect: connection refused"

I'm trying to configure Prometheus and Grafana with my Hyperledger Fabric v1.4 network to analyze the peer and chaincode metrics. Following this documentation, I've mapped the peer container's port 9443 to my host machine's port 9443, and I've changed the provider entry under the metrics section of the peer's core.yaml to prometheus. I've configured Prometheus and Grafana in docker-compose.yml in the following way.

  prometheus:
    image: prom/prometheus:v2.6.1
    container_name: prometheus
    volumes:
    - ./prometheus/:/etc/prometheus/
    - prometheus_data:/prometheus
    command:
    - '--config.file=/etc/prometheus/prometheus.yml'
    - '--storage.tsdb.path=/prometheus'
    - '--web.console.libraries=/etc/prometheus/console_libraries'
    - '--web.console.templates=/etc/prometheus/consoles'
    - '--storage.tsdb.retention=200h'
    - '--web.enable-lifecycle'
    restart: unless-stopped
    ports:
    - 9090:9090
    networks:
    - basic
    labels:
      org.label-schema.group: "monitoring"

  grafana:
    image: grafana/grafana:5.4.3
    container_name: grafana
    volumes:
    - grafana_data:/var/lib/grafana
    - ./grafana/datasources:/etc/grafana/datasources
    - ./grafana/dashboards:/etc/grafana/dashboards
    - ./grafana/setup.sh:/setup.sh
    entrypoint: /setup.sh
    environment:
    - GF_SECURITY_ADMIN_USER=${ADMIN_USER:-admin}
    - GF_SECURITY_ADMIN_PASSWORD=${ADMIN_PASS:-admin}
    - GF_USERS_ALLOW_SIGN_UP=false
    restart: unless-stopped
    ports:
    - 3000:3000
    networks:
    - basic
    labels:
      org.label-schema.group: "monitoring"

When I curl 0.0.0.0:9443/metrics on my remote CentOS machine, I get the full list of metrics. However, when I run Prometheus with the above configuration, it throws the error Get http://localhost:9443/metrics: dial tcp 127.0.0.1:9443: connect: connection refused. This is what my prometheus.yml looks like.

global:
  scrape_interval:     15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 10s
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'peer_metrics'
    scrape_interval: 10s
    static_configs:
      - targets: ['localhost:9443']

Even when I go to the endpoint http://localhost:9443/metrics in my browser, I get all the metrics. What am I doing wrong here? How come Prometheus's own metrics show up on its interface but the peer's don't?

Upvotes: 47

Views: 99619

Answers (8)

abbas

Reputation: 7071

Since the targets are not running inside the Prometheus container, they cannot be reached through localhost. You need to reach them through the host's private IP, or by replacing localhost with host.docker.internal (preferred) or docker.for.mac.localhost (deprecated).

On Windows:

  • host.docker.internal (tested on win10, win11)

On Mac:

  • docker.for.mac.localhost
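Applied to the question's setup, the peer job would then point at the Docker host rather than localhost. A sketch of the relevant prometheus.yml fragment, assuming Docker Desktop (where host.docker.internal resolves out of the box):

```yaml
# prometheus.yml — scrape the peer through the Docker host, not localhost
scrape_configs:
  - job_name: 'peer_metrics'
    scrape_interval: 10s
    static_configs:
      - targets: ['host.docker.internal:9443']
```

On plain Linux (Docker 20.10+), host.docker.internal is not defined by default; you can make it resolvable by adding `extra_hosts: ["host.docker.internal:host-gateway"]` to the prometheus service in docker-compose.yml.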

Upvotes: 68

nikhil

Reputation: 191

If you are pointing at a service in another Docker container, the target is not localhost but the service name (the name shown in docker ps) or the internal IP of the host running the container.

prometheus.yaml

  scrape_configs:
    - job_name: "node-exporter"
      static_configs:
        - targets: ["nodeexporter:9100"]  # docker container name

Upvotes: 0

Dishone Prabu J

Reputation: 9

Run both containers on the same Docker network; that will fix the issue.
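In docker-compose terms, that means attaching Prometheus and the scrape target to one named network. A minimal sketch, with hypothetical service and network names:

```yaml
services:
  prometheus:
    image: prom/prometheus
    networks:
      - monitoring        # same network as the target
  node-exporter:
    image: prom/node-exporter
    networks:
      - monitoring        # so Prometheus can scrape it as node-exporter:9100
networks:
  monitoring:
```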


Upvotes: 0

I realized that I got this error because kube-prometheus-stack pods, including its own Prometheus, were also running in AKS. When I scaled down the kube-prometheus-stack-related pods in the "deployments" and "daemonsets" sections of AKS, the problem was solved and I was able to successfully connect Grafana to Prometheus. Both my Prometheus and kube-prometheus-stack's had been trying to run; the problem was solved once only my Prometheus pods remained.


Upvotes: 0

Shakiba Moshiri

Reputation: 23774

NOTE
This solution is not for Docker Swarm. It is for standalone (multi-container) setups intended to run on an overlay network.

We get the same error when using an overlay network; here is the solution (static, NOT dynamic).

this config does not work:

global:
  scrape_interval:     15s
  evaluation_interval: 15s

  external_labels:
    monitor: 'promswarm'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node'
    static_configs:
      - targets: [ 'localhost:9100' ]

Nor does this one: even though http://docker.for.mac.localhost:9100/ is reachable, Prometheus still cannot find node-exporter. So the config below did not work either:

global:
  scrape_interval:     15s
  evaluation_interval: 15s

  external_labels:
    monitor: 'promswarm'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']


  - job_name: 'node'
    static_configs:
      - targets: [ 'docker.for.mac.localhost:9100'  ]

But simply using the container ID, we can reach that service on its port number.

docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED          STATUS          PORTS                                       NAMES
a58264faa1a4   prom/prometheus          "/bin/prometheus --c…"   5 minutes ago    Up 5 minutes    0.0.0.0:9090->9090/tcp, :::9090->9090/tcp   unruffled_solomon
62310f56f64a   grafana/grafana:latest   "/run.sh"                42 minutes ago   Up 42 minutes   0.0.0.0:3000->3000/tcp, :::3000->3000/tcp   wonderful_goldberg
7f1da9796af3   prom/node-exporter       "/bin/node_exporter …"   48 minutes ago   Up 48 minutes   0.0.0.0:9100->9100/tcp, :::9100->9100/tcp   intelligent_panini

So the prom/node-exporter container has ID 7f1da9796af3, and we can update our yml file to:

global:
  scrape_interval:     15s
  evaluation_interval: 15s

  external_labels:
    monitor: 'promswarm'

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']


  - job_name: 'node'
    static_configs:
      - targets: [ '7f1da9796af3:9100'  ]

(Screenshots of the Prometheus targets page, not working vs. working, omitted.)


UPDATE

I myself was not happy with this hard-coded solution, so after some further searching I found a more reliable approach using --network-alias NAME: within the overlay network, that container becomes routable by that name. So the yml looks like this:

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']


  - job_name: 'node'
    static_configs:
      - targets: [ 'node_exporter:9100' ]

Here the name node_exporter is an alias created with the run subcommand, e.g.

docker run --rm  -d  -v "/:/host:ro,rslave" --network cloud --network-alias node_exporter --pid host -p 9100:9100   prom/node-exporter  --path.rootfs=/host

In a nutshell, it means that on the overlay network cloud you can reach node-exporter at node_exporter:<PORT>.
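The same effect can be had with docker-compose, where a service's name is automatically a DNS alias on its networks. A sketch, assuming a pre-existing attachable overlay network named cloud:

```yaml
services:
  node_exporter:
    image: prom/node-exporter
    command: ['--path.rootfs=/host']
    pid: host
    volumes:
      - '/:/host:ro,rslave'
    networks:
      - cloud
networks:
  cloud:
    external: true   # the overlay network created beforehand
```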

Upvotes: 3

Dashrath Mundkar

Reputation: 9174

I remember I resolved the problem by downloading the Prometheus node exporter for Windows.

Check out this link: https://medium.com/@facundofarias/setting-up-a-prometheus-exporter-on-windows-b3e45f1235a5

Upvotes: 0

Aviv

Reputation: 14467

The problem: you added a service for scraping in Prometheus, but on http://localhost:9090/targets the endpoint state is Down with the error:

Get http://localhost:9091/metrics: dial tcp 127.0.0.1:9091: connect: connection refused


Solution: in prometheus.yml you need to verify that:

  1. the scrape details point to the right endpoint;
  2. the yml indentation is correct;
  3. curl -v http://<serviceip>:<port>/metrics prints the metrics as plain text in your terminal.

Note: if you are pointing at a service in another Docker container, your target might not be localhost but the service name (the name shown in docker ps) or host.docker.internal (which resolves to the host running the Docker container).

For this example, I'll be working with two Docker containers, prometheus and myService.

sudo docker ps

CONTAINER ID   IMAGE                        CREATED       PORTS                    NAMES
abc123         prom/prometheus:latest       2 hours ago   0.0.0.0:9090->9090/tcp   prometheus
def456         myService/myService:latest   2 hours ago   0.0.0.0:9091->9091/tcp   myService

and then edit the file prometheus.yml (and rerun prometheus)

- job_name: myService
  scrape_interval: 15s
  scrape_timeout: 10s
  metrics_path: /metrics
  static_configs:
    - targets:                        # presenting you 3 options
        - localhost:9091              # plain localhost
        - host.docker.internal:9091   # the Docker host that runs the container
        - myService:9091              # docker container name (worked in my case)

Upvotes: 23

antweiss
antweiss

Reputation: 2879

Your prometheus container isn't running on host network. It's running on its own bridge (the one created by docker-compose). Therefore the scrape config for peer should point at the IP of the peer container.

Recommended way of solving this:

  • Run prometheus and grafana on the same Docker network as the fabric network. In your docker-compose for the prometheus stack you can reference it like this:
networks:
  default:
    external:
      name: <your-hyperledger-network>

(use docker network ls to find the network name)

Then you can use http://<peer_container_name>:9443 in your scrape config.
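For example, assuming the peer container is named peer0.org1.example.com (a typical Fabric sample-network name; substitute your own from docker ps), the question's scrape job becomes:

```yaml
# prometheus.yml — scrape the peer by its container name on the shared network
scrape_configs:
  - job_name: 'peer_metrics'
    scrape_interval: 10s
    static_configs:
      - targets: ['peer0.org1.example.com:9443']
```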

Upvotes: 11
