user2103635

Reputation: 11

Deploying Apache Spark on Amazon EMR

My command:

aws emr add-steps --cluster-id j-10366NQM2PJDC --steps Type=spark,Name=SparkWordCountApp,Args=[--deploy-mode,cluster,--master,yarn,--conf,spark.yarn.submit.waitAppCompletion=false,--num-executors,5,--executor-cores,5,--executor-memory,20g,s3://wordCount.py,s3://input-bucket/inputFile.txt,s3://output-bucket/],ActionOnFailure=CONTINUE

An error occurred (AccessDeniedException) when calling the AddJobFlowSteps operation: User: arn:aws:sts::503059920414:assumed-role/EMR_EC2_DefaultRole/i-052a3cd61db3879d9 is not authorized to perform: elasticmapreduce:AddJobFlowSteps on resource: arn:aws:elasticmapreduce:us-east-2:503059920414:cluster/j-10366NQM2PJDC

Does anybody have any pointers on this? I am new to Spark.

Upvotes: 1

Views: 1510

Answers (1)

jc mannem

Reputation: 2343

It looks like you are using your instance profile role, EMR_EC2_DefaultRole, to call EMR and add steps to the cluster. The error means that this role's policy does not allow the elasticmapreduce:AddJobFlowSteps action. In fact, the default managed policy attached to this role, AmazonElasticMapReduceforEC2Role, does not grant this permission.

So you will need to add a policy to the EMR_EC2_DefaultRole role that allows the elasticmapreduce:AddJobFlowSteps action.
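As a rough sketch of what such a policy might look like, the snippet below builds an IAM policy document that allows only elasticmapreduce:AddJobFlowSteps on the single cluster named in the error message. The helper name build_add_steps_policy is made up for illustration, and scoping the Resource to one cluster ARN is an assumption; you may prefer a broader resource depending on your setup.

```python
import json


def build_add_steps_policy(region, account_id, cluster_id):
    """Return an IAM policy document allowing AddJobFlowSteps on one cluster.

    Hypothetical helper for illustration; not part of any AWS SDK.
    """
    # Same ARN format as the resource shown in the AccessDeniedException.
    cluster_arn = (
        f"arn:aws:elasticmapreduce:{region}:{account_id}:cluster/{cluster_id}"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "elasticmapreduce:AddJobFlowSteps",
                "Resource": cluster_arn,
            }
        ],
    }


# Region, account ID, and cluster ID are taken from the error in the question.
policy = build_add_steps_policy("us-east-2", "503059920414", "j-10366NQM2PJDC")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The printed JSON could then be saved to a file and attached to EMR_EC2_DefaultRole as an inline policy, for example with aws iam put-role-policy, run under credentials that are allowed to modify IAM roles. The policy name is up to you.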

Upvotes: 0
