Reputation: 5021
I'm trying to get the Ops Agent working and installed it with the following commands:
curl -sSO https://dl.google.com/cloudagents/add-google-cloud-ops-agent-repo.sh
sudo bash add-google-cloud-ops-agent-repo.sh --also-install
I see the following error in the output of journalctl -u google-cloud-ops-agent-opentelemetry-collector -xn:
otelopscol[2706]: 2022-02-06T21:50:36.140Z info exporterhelper/queued_retry.go:215 Exporting failed. Will retry the request after interval. {"kind": "exporter", "name": "googlecloud", "error": "[rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: metadata: GCE metadata \"instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.read%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.write\" not defined; rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: metadata: GCE metadata \"instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.read%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.write\" not defined; rpc error: code = Unauthenticated desc = transport: per-RPC creds failed due to error: metadata: GCE metadata \"instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.read%2Chttps%3A%2F%2Fwww.googleapis.com%2Fauth%2Fmonitoring.write\" not defined]", "interval": "14.115202828s"}
The services otherwise look healthy and are running, but the UI reports that the Ops Agent isn't running, which I suspect is because no data is being sent back.
Here is the status of running agents:
google-cloud-ops-agent-opentelemetry-collector.service - Google Cloud Ops Agent - Metrics Agent
Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent-opentelemetry-collector.service; static; vendor preset: enabled)
Active: active (running) since Sat 2022-02-05 04:38:41 UTC; 1 day 17h ago
Process: 2690 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -service=otel -in /etc/google-clou>
Main PID: 2706 (otelopscol)
Tasks: 9 (limit: 2369)
Memory: 193.6M
CGroup: /system.slice/google-cloud-ops-agent-opentelemetry-collector.service
└─2706 /opt/google-cloud-ops-agent/subagents/opentelemetry-collector/otelopscol --config=/run/google-cloud-ops-agent-o>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/qu>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: go.opentelemetry.io/collector/exporter/exporterhelper.(*metricsSenderWithObservability).s>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/me>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: go.opentelemetry.io/collector/exporter/exporterhelper.(*queuedRetrySender).start.func1
Feb 06 21:55:53 mongo-1 otelopscol[2706]: /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/qu>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: go.opentelemetry.io/collector/exporter/exporterhelper/internal.consumerFunc.consume
Feb 06 21:55:53 mongo-1 otelopscol[2706]: /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/in>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: go.opentelemetry.io/collector/exporter/exporterhelper/internal.(*boundedMemoryQueue).Star>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: /root/go/pkg/mod/go.opentelemetry.io/[email protected]/exporter/exporterhelper/in>
Feb 06 21:55:53 mongo-1 otelopscol[2706]: 2022-02-06T21:55:53.145Z info exporterhelper/queued_retry.go:215 Exp>
● google-cloud-ops-agent.service - Google Cloud Ops Agent
Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent.service; enabled; vendor preset: enabled)
Active: active (exited) since Sat 2022-02-05 04:38:41 UTC; 1 day 17h ago
Process: 2691 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -in /etc/google-cloud-ops-agent/co>
Process: 2704 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
Main PID: 2704 (code=exited, status=0/SUCCESS)
Feb 05 04:38:41 mongo-1 systemd[1]: Starting Google Cloud Ops Agent...
Feb 05 04:38:41 mongo-1 systemd[1]: Finished Google Cloud Ops Agent.
● google-cloud-ops-agent-fluent-bit.service - Google Cloud Ops Agent - Logging Agent
Loaded: loaded (/lib/systemd/system/google-cloud-ops-agent-fluent-bit.service; static; vendor preset: enabled)
Active: active (running) since Sun 2022-02-06 15:05:35 UTC; 6h ago
Process: 22138 ExecStartPre=/opt/google-cloud-ops-agent/libexec/google_cloud_ops_agent_engine -service=fluentbit -in /etc/googl>
Main PID: 22144 (fluent-bit)
Tasks: 22 (limit: 2369)
Memory: 29.0M
CGroup: /system.slice/google-cloud-ops-agent-fluent-bit.service
└─22144 /opt/google-cloud-ops-agent/subagents/fluent-bit/bin/fluent-bit --config /run/google-cloud-ops-agent-fluent-bi>
Feb 06 15:05:35 mongo-1 systemd[1]: google-cloud-ops-agent-fluent-bit.service: Scheduled restart job, restart counter is at 7.
Feb 06 15:05:35 mongo-1 systemd[1]: Stopped Google Cloud Ops Agent - Logging Agent.
Feb 06 15:05:35 mongo-1 systemd[1]: Starting Google Cloud Ops Agent - Logging Agent...
Feb 06 15:05:35 mongo-1 systemd[1]: Started Google Cloud Ops Agent - Logging Agent.
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: Fluent Bit v1.8.12
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: * Copyright (C) 2019-2021 The Fluent Bit Authors
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: * Copyright (C) 2015-2018 Treasure Data
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: * Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
Feb 06 15:05:35 mongo-1 fluent-bit[22144]: * https://fluentbit.io
Upvotes: 1
Views: 2713
Reputation: 5021
The issue was that the VM had no service account attached, so the agent could not obtain credentials from the metadata server. The fix was to attach a service account to the VM.
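A minimal sketch of attaching a service account to an existing VM, assuming the gcloud CLI is installed and authenticated. The instance has to be stopped before its service account can be changed. Only the instance name mongo-1 comes from the logs above; the zone and the service account email are hypothetical placeholders, and the service account is assumed to already hold the roles the Ops Agent needs (Logs Writer and Monitoring Metric Writer). The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: attach a service account to an existing Compute Engine VM.
# ZONE and SA_EMAIL are hypothetical placeholders; replace them with your own values.
INSTANCE="${INSTANCE:-mongo-1}"
ZONE="${ZONE:-us-central1-a}"
SA_EMAIL="${SA_EMAIL:-ops-agent@my-project.iam.gserviceaccount.com}"

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "+ $*"; fi
}

attach_service_account() {
  # The service account can only be changed while the VM is stopped.
  run gcloud compute instances stop "$INSTANCE" --zone="$ZONE"
  run gcloud compute instances set-service-account "$INSTANCE" \
    --zone="$ZONE" \
    --service-account="$SA_EMAIL" \
    --scopes=cloud-platform
  run gcloud compute instances start "$INSTANCE" --zone="$ZONE"
}

attach_service_account
```

Review the printed commands first, then rerun with DRY_RUN=0 to apply them. The same change can also be made in the Cloud Console by stopping the VM and editing its service account setting.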
Note that a VM normally gets the default service account. In my case I created the VM and explicitly didn't enable any service account, hence the issue.
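To check whether a VM is in this state, one option is to ask the metadata server which service accounts it exposes. The sketch below assumes it is run from inside the VM and that a VM with a service account lists a default/ entry at this endpoint; the function names are made up for illustration, and the exact response for a VM without a service account is not something I have verified.

```shell
#!/usr/bin/env bash
# Sketch: check whether the VM exposes a default service account.
METADATA_URL="http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/"

# Fetch the list of service accounts from the metadata server.
# This only works from inside a Compute Engine VM.
list_service_accounts() {
  curl -s --max-time 5 -H "Metadata-Flavor: Google" "$METADATA_URL"
}

# Read a listing on stdin; succeed if it contains a line that is exactly "default/".
has_default_service_account() {
  grep -qx 'default/'
}

# Combine both steps and print a short verdict.
check_vm() {
  if list_service_accounts | has_default_service_account; then
    echo "service account attached"
  else
    echo "no default service account found"
  fi
}
```

Calling check_vm on the affected VM should print the second message if the token error in the question has the same cause.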
Upvotes: 2