Alex Latchford

Reputation: 655

Autoscaling AWS EMR cluster to 0 nodes

Cross posting from: https://forums.aws.amazon.com/thread.jspa?messageID=766424

Hey,

Trying to apply this policy to a core instance group:

{
    "Constraints": {
        "MinCapacity": 0,
        "MaxCapacity": 2
    },
    "Rules": [
        {
            "Name": "ScaleUp",
            "Action": {
                "Market": "ON_DEMAND",
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "EXACT_CAPACITY",
                    "ScalingAdjustment": 5,
                    "CoolDown": 300
                }
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": "GREATER_THAN",
                    "MetricName": "AppsPending",
                    "Threshold": 0,
                    "Period": 300
                }
            }
        },
        {
            "Name": "ScaleDown",
            "Action": {
                "Market": "ON_DEMAND",
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "EXACT_CAPACITY",
                    "ScalingAdjustment": 0,
                    "CoolDown": 300
                }
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": "LESS_THAN_OR_EQUAL",
                    "MetricName": "AppsRunning",
                    "Threshold": 0,
                    "Period": 300
                }
            }
        }
    ]
}

But I'm getting this error:

An error occurred (ValidationException) when calling the PutAutoScalingPolicy operation: Auto Scaling constraint parameter minCapacity should be at least 1 for Core Instance Group.

I'm no expert in EMR, but from the docs I thought this would be possible. (I can create a master-only cluster manually in the UI, so why does this difference exist?) The master node runs a job on a cron schedule. When that kicks in, it generates the job, and autoscaling should then fire up the core instances to process it and scale back down when the job is done.

Any suggestions?

Thanks, Alex

PS. To clarify the functional requirements: I'm trying to run a Zeppelin dashboard service on the master and have it kick off a batch job every 24h. The job needs a few nodes, and the cluster should scale back down to 0 nodes the rest of the time. Happy to consider other suggestions to achieve this if I've got the wrong end of the stick.

Upvotes: 1

Views: 2393

Answers (2)

Hiranya Deka

Reputation: 241

A single-node cluster is not scalable. You need at least one core node alongside the master node, so when applying a scaling policy the minimum number of core nodes must be 1.

Please find the screenshot from the AWS documentation: [screenshot]

Please refer to this link for more details: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-scale-on-demand.html

Upvotes: 0

Jonathan Kelly

Reputation: 1990

It's true that you can start a single-node, master-only cluster without any core nodes, but this is a special kind of "cluster" that runs everything on the master. It is not possible to transition from a multi-node cluster to a single-node cluster or vice versa. Because of this, the core instance group has a minimum of 1 instance, even when using autoscaling.
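As a rough illustration of what that constraint means for the policy in the question, here is a minimal Python sketch that builds the same two rules with MinCapacity set to 1 instead of 0. The function names `build_core_group_policy` and `apply_policy` are made up for this example, and the local check on the minimum is only my mirror of the ValidationException quoted above, not something the SDK does for you. I've also assumed the scale-up rule should target MaxCapacity rather than the 5 in the question, since 5 exceeds the stated maximum of 2.

```python
from typing import Any, Dict


def build_core_group_policy(min_capacity: int = 1, max_capacity: int = 2) -> Dict[str, Any]:
    """Build an EMR auto scaling policy dict for a core instance group."""
    # Mirrors the service-side rule from the error message: core groups need >= 1.
    if min_capacity < 1:
        raise ValueError("MinCapacity must be at least 1 for a core instance group")
    if max_capacity < min_capacity:
        raise ValueError("MaxCapacity must be >= MinCapacity")

    def rule(name: str, target: int, operator: str, metric: str) -> Dict[str, Any]:
        # Same structure as the rules in the question, with EXACT_CAPACITY targets.
        return {
            "Name": name,
            "Action": {
                "Market": "ON_DEMAND",
                "SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "EXACT_CAPACITY",
                    "ScalingAdjustment": target,
                    "CoolDown": 300,
                },
            },
            "Trigger": {
                "CloudWatchAlarmDefinition": {
                    "ComparisonOperator": operator,
                    "MetricName": metric,
                    "Threshold": 0,
                    "Period": 300,
                }
            },
        }

    return {
        "Constraints": {"MinCapacity": min_capacity, "MaxCapacity": max_capacity},
        "Rules": [
            # Scale up to the maximum when applications are waiting.
            rule("ScaleUp", max_capacity, "GREATER_THAN", "AppsPending"),
            # Scale down to the minimum (1, not 0) when nothing is running.
            rule("ScaleDown", min_capacity, "LESS_THAN_OR_EQUAL", "AppsRunning"),
        ],
    }


def apply_policy(emr_client: Any, cluster_id: str, instance_group_id: str,
                 policy: Dict[str, Any]) -> Any:
    """Attach the policy using a boto3 EMR client passed in by the caller."""
    return emr_client.put_auto_scaling_policy(
        ClusterId=cluster_id,
        InstanceGroupId=instance_group_id,
        AutoScalingPolicy=policy,
    )
```

With boto3 installed and credentials configured, this would be called along the lines of `apply_policy(boto3.client("emr"), "j-XXXXXXXX", "ig-XXXXXXXX", build_core_group_policy())`, where both IDs are placeholders for your own cluster and core instance group.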

Upvotes: 4
