Autoscaling AWS EMR cluster to 0 nodes

Question

Cross-posted from: https://forums.aws.amazon.com/thread.jspa?messageID=766424

Hey,

Trying to apply this policy to a core instance group:

{
    "Constraints": {
        "MinCapacity": 0,
        "MaxCapacity": 2
    },
    "Rules": [
        {
            "Name": "ScaleUp",
            "Action": {
                "Market": "ON_DEMAND",
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "EXACT_CAPACITY",
                    "ScalingAdjustment": 5,
                    "CoolDown": 300
                }
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": "GREATER_THAN",
                    "MetricName": "AppsPending",
                    "Threshold": 0,
                    "Period": 300
                }
            }
        },
        {
            "Name": "ScaleDown",
            "Action": {
                "Market": "ON_DEMAND",
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "EXACT_CAPACITY",
                    "ScalingAdjustment": 0,
                    "CoolDown": 300
                }
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": "LESS_THAN_OR_EQUAL",
                    "MetricName": "AppsRunning",
                    "Threshold": 0,
                    "Period": 300
                }
            }
        }
    ]
}

But I'm getting this error:

An error occurred (ValidationException) when calling the PutAutoScalingPolicy operation: Auto Scaling constraint parameter minCapacity should be at least 1 for Core Instance Group.

I'm no expert in EMR, but from the docs I thought this would be possible. (I can create a master-only cluster manually in the UI, so why does this difference exist?) The master node runs a job on a cron schedule. When that kicks in, it generates the job, autoscaling fires up the core instances to process it, and the group scales back down when the job is done.

Any suggestions?

Thanks, Alex

PS. To clarify the functional requirements: I'm trying to run a Zeppelin dashboard service on the master and have it kick off a batch job every 24h. The job needs a few nodes, and the cluster should then scale back down to 0 nodes the rest of the time. Happy to consider other suggestions to achieve this if I've got the wrong end of the stick.

Jonathan Kelly · Accepted Answer

It's true that you can start a single-node, master-only cluster without any core nodes, but this is a special kind of "cluster" that runs everything on the master. It is not possible to transition from a multi-node cluster to a single-node cluster or vice versa. Because of this, the core instance group has a minimum of 1 instance, even when using autoscaling.
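As a rough illustration of that constraint, here is a minimal Python sketch that rebuilds the policy from the question with the core group's floor set to 1 instead of 0. The helper names (build_core_policy, apply_policy) are hypothetical, and scaling up to MaxCapacity rather than the question's value of 5 is an assumption made so the rule stays within the constraints. The final call uses boto3's put_auto_scaling_policy, the same operation named in the error message; it is not exercised here and would need real cluster and instance group IDs.

```python
def build_core_policy(min_capacity=1, max_capacity=2, cooldown=300, period=300):
    """Build an EMR auto scaling policy dict for a core instance group.

    Hypothetical helper: mirrors the policy in the question, but refuses a
    minimum below 1, since the service rejects that for core groups.
    """
    if min_capacity < 1:
        raise ValueError("MinCapacity must be at least 1 for a core instance group")
    if max_capacity < min_capacity:
        raise ValueError("MaxCapacity must be >= MinCapacity")

    def rule(name, target, operator, metric):
        # One rule: when the metric alarm fires, set the group to `target` nodes.
        return {
            "Name": name,
            "Action": {
                "Market": "ON_DEMAND",
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "EXACT_CAPACITY",
                    "ScalingAdjustment": target,
                    "CoolDown": cooldown,
                },
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": operator,
                    "MetricName": metric,
                    "Threshold": 0,
                    "Period": period,
                }
            },
        }

    return {
        "Constraints": {"MinCapacity": min_capacity, "MaxCapacity": max_capacity},
        "Rules": [
            # Assumption: scale up to the maximum when any app is pending.
            rule("ScaleUp", max_capacity, "GREATER_THAN", "AppsPending"),
            # Scale down to the minimum (1, not 0) when nothing is running.
            rule("ScaleDown", min_capacity, "LESS_THAN_OR_EQUAL", "AppsRunning"),
        ],
    }


def apply_policy(cluster_id, instance_group_id, policy):
    """Attach the policy to an instance group (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so building the dict works without boto3

    emr = boto3.client("emr")
    return emr.put_auto_scaling_policy(
        ClusterId=cluster_id,
        InstanceGroupId=instance_group_id,
        AutoScalingPolicy=policy,
    )
```

With this shape, the cluster idles at one core node between the daily batch runs rather than zero, which is the closest the core group can get to the behaviour asked for.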
