Reputation: 450
I am currently running a process on an ec2 server that needs to run consistently in the background. I tried to login to the server and I continue to get a Network Error: Connection timed out prompt. When I check the instance, I get the following message:
Instance reachability check failed at February 22, 2020 at 11:15:00 PM UTC-5 (1 days, 13 hours and 34 minutes ago)
To troubleshoot, I have tried rebooting the server but that did not correct the problem. How do I correct this and also prevent it from happening again?
Upvotes: 6
Views: 16536
Reputation: 15451
I've gone through the same problem
and then once looking at the EC2 dashboard could see that something wasn't right with it
but for me rebooting
and waiting for a 2-3 minutes solved it and then was able to SSH to the instance just fine
If that becomes a recurrent problem, then I'll follow through with Jeremy Thompson's advice
... put the EC2's in an Auto Scaling Group. The ALB does a health check and it fails will no longer route traffic to that EC2, then the ASG will send a Status check and take the unresponding server out of rotation.
Upvotes: 4
Reputation: 743
An instance status check failure indicates a problem with the instance, such as:
You can check following for troubleshooting https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/TroubleshootingInstancesStopping.html
For future reprting and auto recovery you can create a CloudWatch Alarm
For second part
Nothing you can do to stop its occurrence, but for up-time and availability YES you can create another EC2 and add ALB on the top of both instances which checks the health of instance, so that your users/customers/service might be available during recovery time (from second instance). You can increase number of instances as more as you want for high availability (obviously it involves cost)
Upvotes: 7