Saar peer

Reputation: 847

AWS - SSM Agent on Instances: [<>] are not functioning

I followed this tutorial in order to execute a shell command before an instance is terminated by ASG.

But I keep getting this error when SSM tries to invoke the script:

Step timed out while step is verifying the SSM Agent availability on the target instance(s). SSM Agent on Instances: [i-07b0850b2f3ced30c] are not functioning. Please refer to Automation Service Troubleshooting Guide for more diagnosis details.

What am I missing? Is this because the SSM Agent is stopping? Is it related to permissions?

This is the automation document I am using:

description: 'This document will disjoin instances From an Active Directory, create an AMI of the instance, send a signal to LifeCycleHook to terminate the instance'
schemaVersion: '0.3'
assumeRole: '{{automationAssumeRole}}'
parameters:
  automationAssumeRole:
    default: 'arn:aws:iam::012345678901:role/automationAssumeRole'
    description: (Required) The ARN of the role that allows automation to perform the actions on your behalf.
    type: String
  ASGName:
    default: My_AutoScalingGroup
    type: String
  InstanceId:
    type: String
  LCHName:
    default: my-lifecycle-hook
    type: String
mainSteps:
  - inputs:
      DocumentName: AWS-RunShellScript
      InstanceIds:
        - '{{ InstanceId }}'
      TimeoutSeconds: 3600
      Parameters:
        commands: ifconfig
        executionTimeout: '7200'
    name: DoSomething
    action: 'aws:runCommand'
    onFailure: 'step:TerminateTheInstance'
  - inputs:
      LifecycleHookName: '{{ LCHName }}'
      InstanceId: '{{ InstanceId }}'
      AutoScalingGroupName: '{{ ASGName }}'
      Service: autoscaling
      Api: CompleteLifecycleAction
      LifecycleActionResult: CONTINUE
    name: TerminateTheInstance
    action: 'aws:executeAwsApi'

Upvotes: 3

Views: 9331

Answers (3)

Filip

Reputation: 1

I had a similar problem. What worked for me was adding the "AmazonSSMFullAccess" policy to the instance's role.
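
To confirm whether the policy is actually attached before re-running the automation, something like the following could be used. It is a minimal sketch, assuming boto3 and an IAM client are available; the helper name `role_has_policy` and the role name in the usage comment are hypothetical.

```python
def role_has_policy(iam_client, role_name, policy_name="AmazonSSMFullAccess"):
    """Return True if a managed policy with this name is attached to the role."""
    kwargs = {"RoleName": role_name}
    while True:
        page = iam_client.list_attached_role_policies(**kwargs)
        if any(p["PolicyName"] == policy_name for p in page["AttachedPolicies"]):
            return True
        if not page.get("IsTruncated"):
            return False
        # More results exist: continue from the marker returned by IAM.
        kwargs["Marker"] = page["Marker"]


# Example (needs boto3 and AWS credentials; "my-instance-role" is hypothetical):
#   import boto3
#   print(role_has_policy(boto3.client("iam"), "my-instance-role"))
```

Note that the role to check is the one in the instance profile, not the automation's assume role.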

Upvotes: 0

AditYa

Reputation: 907

I had the same error and fixed it with the troubleshooting steps below.

  1. Check the IAM role permissions: the instance's role should have the "AmazonSSMFullAccess" policy attached.
  2. Check that the security groups attached to the instance allow HTTPS (port 443) outbound. SSM Agent connects to the Systems Manager endpoints over HTTPS, so it needs no inbound rule.
  3. Check whether SSM Agent is running on the instance. If it is not, use the Systems Manager document below to start it (on a Linux instance, use shell commands or a script instead).

Document to start SSM Agent on a Windows instance:

{
  "schemaVersion": "2.0",
  "description": "Start SSM agent on instance",
  "mainSteps": [
    {
      "action": "aws:runPowerShellScript",
      "name": "runPowerShellScript",
      "inputs": {
        "runCommand": [
          "Start-Service AmazonSSMAgent"
        ]
      }
    }
  ]
}
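
The port 443 check from step 2 can also be scripted. This is a rough sketch, assuming the rule list comes from boto3's `describe_security_groups`; the helper name `allows_https` and the security group ID in the usage comment are hypothetical, and the check ignores the CIDR ranges or prefix lists a rule is scoped to.

```python
def allows_https(permissions):
    """Return True if any rule in an IpPermissions-style list covers TCP 443."""
    for rule in permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports.
            return True
        if protocol == "tcp":
            from_port = rule.get("FromPort")
            to_port = rule.get("ToPort")
            if from_port is not None and to_port is not None:
                if from_port <= 443 <= to_port:
                    return True
    return False


# Example (needs boto3 and AWS credentials; the group ID is hypothetical):
#   import boto3
#   ec2 = boto3.client("ec2")
#   sg = ec2.describe_security_groups(GroupIds=["sg-0123456789abcdef0"])["SecurityGroups"][0]
#   print(allows_https(sg["IpPermissionsEgress"]))
```

Because the agent initiates its connections to the Systems Manager endpoints, the egress list (`IpPermissionsEgress`) is the one passed in the example.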

I hope these steps help. Thank you.

Upvotes: 0

Justin Thomas

Reputation: 54

Is the instance already a managed instance when the document runs? It should be. This error suggests that the SSM Agent is not active on the instance, so the command is never delivered.

I wouldn't expect the SSM Agent to stop because of a scale-in, because the lifecycle hook keeps the instance in the Terminating:Wait state.
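
One way to check whether the instance is managed at that moment is to ask Systems Manager for its ping status. The sketch below assumes boto3 and an SSM client; the helper name `ssm_ping_status` is hypothetical, and the instance ID in the usage comment is the one from the error message in the question.

```python
def ssm_ping_status(ssm_client, instance_id):
    """Return the instance's SSM PingStatus, or None if SSM does not know it."""
    response = ssm_client.describe_instance_information(
        Filters=[{"Key": "InstanceIds", "Values": [instance_id]}]
    )
    instances = response["InstanceInformationList"]
    if not instances:
        # The instance never registered with Systems Manager.
        return None
    return instances[0]["PingStatus"]


# Example (needs boto3 and AWS credentials):
#   import boto3
#   print(ssm_ping_status(boto3.client("ssm"), "i-07b0850b2f3ced30c"))
```

A result of "Online" means the agent is reachable. "ConnectionLost", "Inactive", or no entry at all would be consistent with the timeout in the question.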

Upvotes: 0
