CorribView

Reputation: 741

Add EC2 action to CloudWatch alarm on health check

I have an AWS Bitnami instance, and I've created a Route 53 health check alarm that fires when the website becomes unavailable. The first action triggers successfully and sends me an email. However, I would also like the instance to be rebooted, but the "Add EC2 Action" option is greyed out and reads: "This action is only available for EC2 Per-Instance Metrics". How can I add this?

Could it be related to this? My EC2 instance is in the Ireland region, but when I create the alarm and SNS topic on the health check in Route 53, they are automatically created in the N. Virginia region, and I don't appear to be able to change that.

Upvotes: 4

Views: 1979

Answers (1)

Zdenek F

Reputation: 1899

Original solution:

Your best course of action is to use CloudWatch Events.

You can create a rule that matches a CloudWatch Alarm State Change event for your alarm and targets the EC2 RebootInstances API call.

The event pattern looks like this:

{
  "source": [
    "aws.cloudwatch"
  ],
  "detail-type": [
    "CloudWatch Alarm State Change"
  ],
  "detail": {
    "alarmName": ["YOUR_ROUTE53_ALARM_NAME"],
    "previousState": {
        "value": ["OK"]
    },
    "state": {
        "value": ["ALARM"]
    }
  }
}

The pattern syntax is a bit odd in that you have to wrap even a single string in an array. This pattern matches a state change for the alarm named YOUR_ROUTE53_ALARM_NAME whose previous state was OK and whose current state is ALARM.
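
To make the array wrapping less mysterious, here is a minimal sketch of how exact-value matching behaves. It is a simplified stand-in, not the real CloudWatch Events matcher (it ignores prefix and numeric operators and array-valued event fields), and the two sample events are hypothetical, trimmed down to the fields the pattern inspects:

```javascript
// Simplified illustration of exact-value matching in an event pattern.
// Assumption: only leaf arrays and nested objects are supported here.
function matchesPattern(pattern, event) {
  return Object.entries(pattern).every(([key, expected]) => {
    const actual = event[key];
    if (Array.isArray(expected)) {
      // A leaf array lists the accepted values; any one of them matches.
      return expected.includes(actual);
    }
    // A nested object means "descend into the same field of the event".
    return typeof actual === 'object' && actual !== null &&
      matchesPattern(expected, actual);
  });
}

const pattern = {
  source: ['aws.cloudwatch'],
  'detail-type': ['CloudWatch Alarm State Change'],
  detail: {
    alarmName: ['YOUR_ROUTE53_ALARM_NAME'],
    previousState: { value: ['OK'] },
    state: { value: ['ALARM'] },
  },
};

// Hypothetical event: the alarm goes from OK to ALARM.
const okToAlarm = {
  source: 'aws.cloudwatch',
  'detail-type': 'CloudWatch Alarm State Change',
  detail: {
    alarmName: 'YOUR_ROUTE53_ALARM_NAME',
    previousState: { value: 'OK' },
    state: { value: 'ALARM' },
  },
};

// Hypothetical event: the alarm goes from INSUFFICIENT_DATA to ALARM.
const fromInsufficientData = {
  ...okToAlarm,
  detail: { ...okToAlarm.detail, previousState: { value: 'INSUFFICIENT_DATA' } },
};

console.log(matchesPattern(pattern, okToAlarm));            // true
console.log(matchesPattern(pattern, fromInsufficientData)); // false
```

The second result is worth noticing: with previousState restricted to OK, a transition from INSUFFICIENT_DATA to ALARM would not fire the rule. If you want that transition to count as well, add it to the same array.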

I added the previous state to the match because I don't know whether the alarm triggers multiple times. Without it, you might end up in an infinite reboot loop, IMHO.

Regarding the targets, I'd let CloudWatch Events create the new role for you.


Updated solution: (the questioner requires separate stop and start calls because of reasons)

You'd still use CloudWatch Events (CWE) to detect the change in the alarm.

Then you've got two options:

  1. use a Lambda to handle the separate stop/start, which I'd recommend:

    1. create a Node.js 12 Lambda (every Node Lambda has the AWS JS SDK available); the functions you need are in the AWS.EC2 class
    2. call stopInstances for your instance; the instance state will change to stopping
    3. use waitFor to wait for the instance state to change to stopped
    4. call startInstances to start it up again

    You need to make sure your Lambda has the necessary IAM permissions to stop and start the EC2 instance.
  2. create two CWE rules

    1. first rule detects the alarm and targets EC2 StopInstances API call (same as in my original solution, just a slightly different target)
    2. the second rule matches on that instance state change and targets EC2 StartInstances API call.

      The state change notification looks like this:

      {
         "id":"7bf73129-1428-4cd3-a780-95db273d1602",
         "detail-type":"EC2 Instance State-change Notification",
         "source":"aws.ec2",
         "account":"123456789012",
         "time":"2015-11-11T21:29:54Z",
         "region":"us-east-1",
         "resources":[
            "arn:aws:ec2:us-east-1:123456789012:instance/i-abcd1111"
         ],
         "detail":{
            "instance-id":"i-abcd1111",
            "state":"stopped"
         }
      }
      

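      The event pattern to match that notification is simple: match the source aws.ec2 and the detail-type EC2 Instance State-change Notification, and, under detail, your instance-id and the state stopped.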

      The problem with this solution is that the state-change notification has no fields besides the state and instance-id, so there is no way to distinguish a shutdown triggered by the first rule from a normal shutdown. Every shutdown would trigger this rule and start the instance again.

      If you wanted to shut down your instance manually, you'd have to disable the second CWE rule first (a rule can be enabled/disabled) so it doesn't trigger the start, which might be a reasonable trade-off for you.

Btw, I'd say there is something fishy going on with your instance if the EC2 reboot is not enough.

Upvotes: 5
