Sam Stoelinga

Reputation: 5021

Error with Ops Agent GCE metadata unauthenticated

I'm trying to get the Ops Agent working on a GCE VM and used the following commands to install it:

curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
sudo bash add-google-cloud-ops-agent-repo.sh --also-install

The output of journalctl -u google-cloud-ops-agent-opentelemetry-collector -xn shows the following error:

otelopscol[2706]: 2022-02-06T21:50:36.140Z        info        exporterhelper/queued_retry.go:215        Exporting failed. Will retry the request after interval.        {"kind": "exporter", "name": "googlecloud", "error": "[rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: metadata: GCE metadata \"instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.read%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.write\" not defined; rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: metadata: GCE metadata \"instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.read%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.write\" not defined; rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: metadata: GCE metadata \"instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.read%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.write\" not defined]", "interval": "14.115202828s"}

The services otherwise look healthy and are running, but the UI reports that the Ops Agent isn't running, which I suspect is because no data is being sent back.

Here is the status of running agents:

google-cloud-ops-agent-opentelemetry-collector.service - Google Cloud Ops Agent - Metrics Agent
     Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent-opentelemetry-collector.service; static; vendor preset: enabled)
     Active: active (running) since Sat 2022-02-05 04:38:41 UTC; 1 day 17h ago
    Process: 2690 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -service=otel -in /etc/google-clou>
   Main PID: 2706 (otelopscol)
      Tasks: 9 (limit: 2369)
     Memory: 193.6M
     CGroup: /system.slice/google-cloud-ops-agent-opentelemetry-collector.service
             └─2706 /opt/google-cloud-ops-agent/subagents/opentelemetry-collector/otelopscol --config=/run/google-cloud-ops-agent-o>

Feb 06 21:55:53 mongo-1 otelopscol[2706]:         /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/qu>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: go.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).s>
Feb 06 21:55:53 mongo-1 otelopscol[2706]:         /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/me>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1
Feb 06 21:55:53 mongo-1 otelopscol[2706]:         /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/qu>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: go.opentelemetry.io/collector/exporter/exporterhelper/internal.consumerFunc.consume
Feb 06 21:55:53 mongo-1 otelopscol[2706]:         /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/in>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).Star>
Feb 06 21:55:53 mongo-1 otelopscol[2706]:         /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/in>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: 2022-02-06T21:55:53.145Z        info        exporterhelper/queued_retry.go:215        Exp>

● google-cloud-ops-agent.service - Google Cloud Ops Agent
     Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent.service; enabled; vendor preset: enabled)
     Active: active (exited) since Sat 2022-02-05 04:38:41 UTC; 1 day 17h ago
    Process: 2691 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -in /etc/google-cloud-ops-agent/co>
    Process: 2704 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
   Main PID: 2704 (code=exited, status=0/SUCCESS)

Feb 05 04:38:41 mongo-1 systemd[1]: Starting Google Cloud Ops Agent...
Feb 05 04:38:41 mongo-1 systemd[1]: Finished Google Cloud Ops Agent.
● google-cloud-ops-agent-fluent-bit.service - Google Cloud Ops Agent - Logging Agent
     Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent-fluent-bit.service; static; vendor preset: enabled)
     Active: active (running) since Sun 2022-02-06 15:05:35 UTC; 6h ago
    Process: 22138 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -service=fluentbit -in /etc/googl>
   Main PID: 22144 (fluent-bit)
      Tasks: 22 (limit: 2369)
     Memory: 29.0M
     CGroup: /system.slice/google-cloud-ops-agent-fluent-bit.service
             └─22144 /opt/google-cloud-ops-agent/subagents/fluent-bit/bin/fluent-bit --config /run/google-cloud-ops-agent-fluent-bi>

Feb 06 15:05:35 mongo-1 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Scheduled restart job, restart counter is at 7.
Feb 06 15:05:35 mongo-1 systemd[1]: Stopped Google Cloud Ops Agent - Logging Agent.
Feb 06 15:05:35 mongo-1 systemd[1]: Starting Google Cloud Ops Agent - Logging Agent...
Feb 06 15:05:35 mongo-1 systemd[1]: Started Google Cloud Ops Agent - Logging Agent.
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: Fluent Bit v1.8.12
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: * Copyright (C) 2019-2021 The Fluent Bit Authors
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: * Copyright (C) 2015-2018 Treasure Data
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: * https://fluentbit.io

Upvotes: 1

Views: 2713

Answers (1)

Sam Stoelinga

Reputation: 5021

The issue was that the VM had no service account attached, so the metadata server could not issue a token for the agent. The solution was to do the following:

  1. Create a service account.
  2. Assign the service account the Logs Writer and Monitoring Metric Writer roles.
  3. Stop the VM, edit it to use the newly created service account, then start the VM again.
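
The three steps above can also be done from the command line. The following is only a sketch, not what I originally ran (I used the Console). The project ID, zone and service account name are hypothetical placeholders, and the instance name is taken from the logs in the question. I'm assuming roles/logging.logWriter and roles/monitoring.metricWriter are the role IDs behind Logs Writer and Monitoring Metric Writer, and that the cloud-platform scope is acceptable for your setup. By default the script only prints the gcloud commands; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical placeholders: replace with your own values.
PROJECT_ID="${PROJECT_ID:-my-project}"
ZONE="${ZONE:-us-central1-a}"
INSTANCE="${INSTANCE:-mongo-1}"
SA_NAME="${SA_NAME:-ops-agent-sa}"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# 1. Create the service account.
run gcloud iam service-accounts create "$SA_NAME" --project "$PROJECT_ID"

# 2. Grant it the Logs Writer and Monitoring Metric Writer roles.
for role in roles/logging.logWriter roles/monitoring.metricWriter; do
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member "serviceAccount:${SA_EMAIL}" --role "$role"
done

# 3. Stop the VM, attach the service account, start the VM again.
run gcloud compute instances stop "$INSTANCE" --zone "$ZONE" --project "$PROJECT_ID"
run gcloud compute instances set-service-account "$INSTANCE" \
  --zone "$ZONE" --project "$PROJECT_ID" \
  --service-account "$SA_EMAIL" --scopes cloud-platform
run gcloud compute instances start "$INSTANCE" --zone "$ZONE" --project "$PROJECT_ID"
```

The VM has to be stopped before its service account can be changed, which is why the stop and start commands bracket the set-service-account call.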

Note that a VM normally gets a default service account. In my case I created the VM and explicitly chose not to enable any service account, hence the issue.
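
To check whether a VM is in the same situation, you can ask the metadata server for the default service account's email. This is a rough sketch: has_service_account is a helper name I made up, and I'm assuming that a VM without a service account answers this path with an error, in line with the "not defined" message in the logs above, so that curl -f produces no output.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads the metadata server response on stdin and
# reports whether it contains anything (an email) or is empty.
has_service_account() {
  if grep -q '[^[:space:]]'; then
    echo "service account attached"
  else
    echo "no service account attached"
    return 1
  fi
}

# Run this on the VM itself (left commented out so the sketch is safe to paste anywhere):
# curl -sf -H "Metadata-Flavor: Google" \
#   "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email" \
#   | has_service_account
```

If it reports that no service account is attached, the Ops Agent has no credentials to export metrics or logs with, and the steps above apply.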

Upvotes: 2
