wibwaj

Reputation: 113

Cannot SSH Google Cloud VM instance after VM restart

I am using Google Cloud Platform and connect to my VM instance through the Google Cloud Console. I restarted the VM without reserving a static IP, so the ephemeral IP changed on restart. I restarted the VM because I noticed that CPU utilization was at a constant 100%, which I figured was not the CPU of my own VM instance (Ubuntu 16.x) but the CPU utilization of the shared Google container. It was also not allowing me to SSH in to my VM instance, so I thought a restart might help.

The restart did help, but the IP changed :( I run Apache and Nginx servers, so I had to manually update the new IP in the respective configuration files for my apps to run. Since the restart I have been having trouble connecting to the VM instance via SSH.

  • Firewall rules - OK (set to allow port 22)
  • .ssh/sshd_conf - OK (RSAauth yes)
  • GCE VM SSH Key - OK (public key for user is saved)

I tried the following steps to resolve the issue, but in vain:

  1. Removed the SSH key pairs from the metadata and SSH keys sections and generated a new public key using PuTTYgen
  2. Verified the key formatting from PuTTYgen and ensured the correct public key was saved in the SSH keys section of the Google VM instance
  3. When I noticed that /etc/ssh/authorized_keys was empty, I reinitialized using gcloud init, which took care of the OAuth part but did not resolve the issue
  4. Tried the gcloud command in my local Google Cloud SDK shell, but it keeps throwing the error "server refused key"

Finally, here is the trace log from /var/log/syslog:

Sep 25 22:30:01 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 CRON[1746]: (root) CMD (/google/scripts/gcloud_docker_auth.sh)
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:Proxying devshell request, attempt (1 of 3)
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:Connecting to DEVSHELL_CLIENT_PORT 40159
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:writing to devshell 4 bytes
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:read from devshell 293 bytes
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:Closing devshell forwarding connection.
Sep 25 22:33:19 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: credentials-service INFO:root:Closing client connection.
Sep 25 22:35:01 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 CRON[1774]: (root) CMD (/google/scripts/gcloud_docker_auth.sh)
Sep 25 22:37:10 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:saw no newline in the first 6 bytes Retrying...(1$
Sep 25 22:37:14 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service ERROR:root:Error, could not connect to devshell. Giving up.
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service Traceback (most recent call last):
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service   File "/google/credentials/control_server.py", line 110, i$
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service     self.hanging_socket.connect(('localhost', self.server_p$
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service   File "/usr/lib/python2.7/socket.py", line 224, in meth
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service     return getattr(self._sock,name)(*args)
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service error: [Errno 111] Connection refused
Sep 25 22:37:22 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: 2017-09-25 22:37:22,640 INFO exited: control-command-service (exit status 0; expect$
Sep 25 22:37:23 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: 2017-09-25 22:37:23,642 INFO spawned: 'control-command-service' with pid 1801
Sep 25 22:37:23 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:23 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:24 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: 2017-09-25 22:37:24,705 INFO success: control-command-service entered RUNNING state$
Sep 25 22:37:27 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:27 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Error, could not connect to devshell. Retrying...$
Sep 25 22:37:34 cs-6000-devshell-vm-c72ffc0b-5c39-48a6-854c-fce64f031c54-41 supervisord: control-command-service INFO:root:Executing health check.

Upvotes: 3

Views: 5339

Answers (2)

Alioua

Reputation: 1776

Your issue may be with the guest environment.

  1. Go to the VM instances page in Google Cloud Platform console.
  2. Click on the instance for which you want to add a startup script.
  3. Click the Edit button at the top of the page.
  4. Click on ‘Enable connecting to serial ports’
  5. Under Custom metadata, click Add item.
  6. Set 'Key' to 'startup-script' and set 'Value' to this script:

#! /bin/bash
useradd -G sudo USERNAME
echo 'USERNAME:PASSWORD' | chpasswd

  7. Click Save and then click RESET at the top of the page. You might need to wait some time for the instance to reboot.
  8. Click 'Connect to serial port' on the page.
  9. In the new window, you might need to wait a bit and press Enter once; then you should see the login prompt.
  10. Log in using the USERNAME and PASSWORD you provided.
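
For anyone who prefers the command line, the console steps above can be sketched roughly with gcloud. This is only a sketch under assumptions: the instance name and zone are hypothetical placeholders, USERNAME and PASSWORD are the same placeholders as in the script above, and the gcloud commands are printed rather than executed so you can review them first.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own instance name and zone.
INSTANCE="my-instance"
ZONE="us-central1-a"

# Save the startup script from the steps above to a local file.
cat > startup.sh <<'EOF'
#! /bin/bash
useradd -G sudo USERNAME
echo 'USERNAME:PASSWORD' | chpasswd
EOF

# The commands below are only printed; remove the leading "echo" to run them.
# Enable the serial console on the instance.
echo gcloud compute instances add-metadata "$INSTANCE" --zone "$ZONE" \
  --metadata serial-port-enable=TRUE
# Attach the startup script as instance metadata.
echo gcloud compute instances add-metadata "$INSTANCE" --zone "$ZONE" \
  --metadata-from-file startup-script=startup.sh
# Reset the instance so the startup script runs.
echo gcloud compute instances reset "$INSTANCE" --zone "$ZONE"
# Open the serial console and log in with the new user.
echo gcloud compute connect-to-serial-port "$INSTANCE" --zone "$ZONE"
```

Once you are back in, it is probably worth removing the startup-script metadata key again, since the script otherwise runs on every boot.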

Then, inside the instance, find out which part is not working by validating the guest environment:

First, check whether the lines below are listed in your serial console:

  • Started Google Compute Engine Accounts Daemon
  • Started Google Compute Engine IP Forwarding Daemon
  • Started Google Compute Engine Clock Skew Daemon
  • Started Google Compute Engine Instance Setup
  • Started Google Compute Engine Startup Scripts
  • Started Google Compute Engine Shutdown Scripts
  • Started Google Compute Engine Network Setup

Second, verify that the packages for the guest environment are installed by running this command in the serial console:

apt list --installed | grep google-compute

It should list the lines below:

  • google-compute-engine
  • google-compute-engine-oslogin
  • python-google-compute-engine
  • python3-google-compute-engine

Third, verify that all the services for the guest environment are enabled by running this command:

sudo systemctl list-unit-files | grep google | grep enabled

It should list the lines below:

  • google-accounts-daemon.service enabled
  • google-ip-forwarding-daemon.service enabled
  • google-clock-skew-daemon.service enabled
  • google-instance-setup.service enabled
  • google-shutdown-scripts.service enabled
  • google-startup-scripts.service enabled
  • google-network-setup.service enabled

If the output differs from the above, you may need to restart the affected service or install the guest environment.
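
To avoid comparing the service list by eye, a small helper along these lines could report which of the expected units are not enabled. It is a minimal sketch: the function name missing_units is made up for illustration, and the unit names are simply the ones listed above, which may differ on other or newer images.

```shell
#!/bin/sh
# Unit names taken from the list above (assumption: they match your image).
EXPECTED_UNITS="google-accounts-daemon google-ip-forwarding-daemon
google-clock-skew-daemon google-instance-setup google-shutdown-scripts
google-startup-scripts google-network-setup"

# Reads `systemctl list-unit-files` output on stdin and prints every
# expected unit that is not reported as enabled.
missing_units() {
  listing=$(cat)
  for unit in $EXPECTED_UNITS; do
    if ! printf '%s\n' "$listing" |
        grep -Eq "^${unit}\.service[[:space:]]+enabled"; then
      echo "$unit"
    fi
  done
}

# Usage on the instance:
#   sudo systemctl list-unit-files | missing_units
```

Any unit it prints is a candidate for `sudo systemctl enable` followed by `sudo systemctl restart` on that unit; if most of them are missing, reinstalling the guest environment packages is likely the better route.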

Upvotes: 1

PeterB

Reputation: 121

I had a similar issue, and in my case the root cause was that the configured SSH keys did not survive a restart of the VM (they disappeared from the VM configuration once the instance was restarted).

I'm not quite sure what the real reason is, but my humble theory is that SSH keys are by default stored directly on the boot disk (not on a persistent volume). In my case the VM was configured with the "Delete boot disk when instance is deleted" option enabled, and I'm guessing this feature was somehow triggered by the restart, meaning the SSH keys were lost when the boot disk was deleted.

Upvotes: 0
