bwighthunter

Reputation: 137

AWS CloudFormation error: ElasticMapReduce Cluster failed to stabilize

I have been getting this error consistently, even though my research suggests that it is an error internal to Amazon. I have no idea where to start debugging it, or whether there is anything I can do to fix it.

The fact that I have been getting it consistently makes me think that it is something wrong with my script. Here it is:

{
  "Description": "Demo pipeline.",
  "Resources": {
    "s3Demo": {
        "Type" : "AWS::S3::Bucket",
        "Properties" : {
            "BucketName" : "example-dna-demo"
        }
    },

    "s3Access": {
        "Type": "AWS::IAM::Role",
        "Properties": {
            "ManagedPolicyArns": [
                "arn:aws:iam::aws:policy/AmazonS3FullAccess"
            ],
            "AssumeRolePolicyDocument": {
                "Version": "2012-10-17",
                "Statement": [{
                    "Effect": "Allow",
                    "Action": "sts:AssumeRole",
                    "Principal":{
                        "Service": "firehose.amazonaws.com"
                    }
                }]
            }, 
            "RoleName": "kinesisS3Access"
        },
        "DependsOn": "s3Demo"
    },

    "kinesisDemo": {
        "Type": "AWS::KinesisFirehose::DeliveryStream",
        "Properties": {
            "DeliveryStreamName": "Demo-Stream",
            "S3DestinationConfiguration": {
                "BucketARN" : "arn:aws:s3:::example-dna-demo",
                "BufferingHints" : {
                    "IntervalInSeconds" : 300,
                    "SizeInMBs" : 5
                },
                "CompressionFormat" : "UNCOMPRESSED",
                "Prefix" : "twitter",
                "RoleARN" : { "Fn::GetAtt": [ "s3Access", "Arn" ]}
            }
        },
        "DependsOn": "s3Access"
    },

    "S3LambdaAccess":{
        "Type": "AWS::IAM::Role",
        "Properties": {
            "ManagedPolicyArns": [
                "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
            ],
            "AssumeRolePolicyDocument": {
                "Version": "2012-10-17",
                "Statement": [{
                    "Effect": "Allow",
                    "Action": "sts:AssumeRole",
                    "Principal":{
                        "Service": "lambda.amazonaws.com"
                    }
                }]
            }, 
            "RoleName": "lambdaS3Access"
        }
    },
    "LambdaDemo": {
        "Type" : "AWS::Lambda::Function",
        "Properties" : {
            "Code" : {
                "S3Bucket" : "example-dna-cloud-formation",
                "S3Key" : "lambda_function.py.zip"
            },
            "Description" : "Looks for S3 writes and loads them into another resource",
            "FunctionName" : "DemoLambdaFunction",
            "Handler" : "lambda-handler",
            "Role" : { "Fn::GetAtt": [ "S3LambdaAccess", "Arn" ]},
            "Runtime" : "python2.7"
        },
        "DependsOn": "S3LambdaAccess"
    },
    "EMRClusterJobFlowRole": {
        "Type": "AWS::IAM::Role",
        "Properties": {
            "AssumeRolePolicyDocument": {  
                "Version": "2012-10-17",
                "Statement": [{
                    "Effect": "Allow",
                    "Action": "sts:AssumeRole",
                    "Principal":{
                        "Service": "ec2.amazonaws.com"
                    }
                }] 
            },
            "RoleName": "ClusterRole"
        }
    },
    "EMRServiceRole": {
        "Type": "AWS::IAM::Role",
        "Properties": {
            "AssumeRolePolicyDocument": { 
              "Version": "2012-10-17",
                "Statement": [{
                    "Effect": "Allow",
                    "Action": "sts:AssumeRole",
                    "Principal":{
                        "Service": "ec2.amazonaws.com"
                    }
                }]
            },
            "RoleName": "EC2InstanceRole"
        }
    },
    "EMR":{
        "Type" : "AWS::EMR::Cluster",
        "Properties" : {
            "Applications": [
                {
                    "Name" : "Spark"
                }
            ],
            "ReleaseLabel": "emr-5.0.0",
            "Instances" : {
                "CoreInstanceGroup" : {
                    "BidPrice": 0.06,
                    "InstanceCount" : 1,
                    "InstanceType" : "m4.large",
                    "Market": "SPOT"
                    },
                "MasterInstanceGroup" : {
                    "BidPrice": 0.06,
                    "InstanceCount" : 1,
                    "InstanceType" : "m4.large",
                    "Market": "SPOT"
                    }
                },
            "JobFlowRole" : "EMRClusterJobFlowRole",
            "Name" : "DemoEMR",
            "ServiceRole" : "EMRServiceRole",
            "LogUri":"s3://toyota-dna-cloud-formation/cf-logging"
        },
        "DependsOn": ["EMRServiceRole", "EMRServiceRole"]
    }
  }
}

I imagine that you probably can't run it as-is, because the Lambda function gets its code from an S3 bucket whose name I've changed here. I am just learning CloudFormation scripts, and I know there is a lot that I am not doing here, but I just want to build a small thing that works and then fill it out a little more.

I know that my script worked up until I added the two EMR IAM roles and the EMR cluster. Thanks in advance.

EDIT: I specified recent instance types and set a ReleaseLabel property, with no luck. Same error.

Upvotes: 1

Views: 2410

Answers (3)

Anton Kraievyi

Reputation: 4342

In my case, this was due to a missing autoscaling role called EMR_AutoScaling_DefaultRole.

Once I created it via aws emr create-default-roles, my CloudFormation stack started deploying nicely again (it had been deploying fine until I added autoscaling).
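
For reference, a cluster that uses autoscaling names that role through the AutoScalingRole property of AWS::EMR::Cluster. Below is a minimal fragment, assuming the role exists under the default name mentioned above. The logical name EMR is borrowed from the question, and only the relevant key is shown; the rest of the cluster definition is omitted.

```json
{
  "EMR": {
    "Type": "AWS::EMR::Cluster",
    "Properties": {
      "AutoScalingRole": "EMR_AutoScaling_DefaultRole"
    }
  }
}
```

If the role has a different name in your account, substitute that name.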

Upvotes: 4

bwighthunter

Reputation: 137

So it turns out that there was no default VPC in the region I was running the script in, and that is the reason that my EMR cluster was failing to stabilize.

When I tried running it in a different region, it worked, but only because that region DID have a default VPC.
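
If you need to stay in a region that has no default VPC, one option is to tell the cluster which subnet to launch into, using Ec2SubnetId inside the Instances block. The fragment below is an untested sketch: EmrSubnetId is a hypothetical parameter name introduced for illustration, and only the added keys are shown.

```json
{
  "Parameters": {
    "EmrSubnetId": {
      "Type": "AWS::EC2::Subnet::Id",
      "Description": "Hypothetical parameter: ID of an existing subnet for the EMR cluster"
    }
  },
  "Resources": {
    "EMR": {
      "Type": "AWS::EMR::Cluster",
      "Properties": {
        "Instances": {
          "Ec2SubnetId": { "Ref": "EmrSubnetId" }
        }
      }
    }
  }
}
```

The CoreInstanceGroup and MasterInstanceGroup entries from the question would stay alongside Ec2SubnetId inside Instances.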

Upvotes: 1

Kunal

Reputation: 46

It could be that your account has reached the EC2 limit in the region you are trying to deploy to. Have you tried a different region?

Upvotes: 2
