Mehaboob Khan

Reputation: 353

Unable to create AWS EMR cluster using Java SDK

I was starting an AWS EMR cluster using the Java SDK (code snippet below), which worked perfectly fine.

BasicAWSCredentials awsCreds = new BasicAWSCredentials(accessKeyId, secretAccessKeyId);
AmazonElasticMapReduce emrClient = AmazonElasticMapReduceClientBuilder.standard()
                    .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                    .withRegion(region)
                    .build();

JobFlowInstancesConfig jobFlowInstanceConfig = new JobFlowInstancesConfig()
                .withEc2SubnetId("subnetId")
                .withEc2KeyName("ec2KeyName")
                .withInstanceCount(3)
                .withKeepJobFlowAliveWhenNoSteps(true)
                // instance types are passed as strings
                .withMasterInstanceType("c5.4xlarge")
                .withSlaveInstanceType("c5.4xlarge");

// create the cluster
RunJobFlowRequest request = new RunJobFlowRequest()
                .withName("clusterName")
                .withReleaseLabel("emr-5.23.0")
                // applications installed on the cluster
                .withApplications(
                        new Application().withName("Hadoop"),
                        new Application().withName("Spark"),
                        new Application().withName("Ganglia"),
                        new Application().withName("Zeppelin"))
                .withLogUri("s3 path")
                .withServiceRole("EMR_DefaultRole")
                .withJobFlowRole("EMR_EC2_DefaultRole")
                .withInstances(jobFlowInstanceConfig);

RunJobFlowResult runJobFlowResult = emrClient.runJobFlow(request); 

Later, in another AWS environment, our AWS team created a role that allows clusters to be created from a particular EC2 instance, but I am unable to create a cluster there. The changes relative to my previous configuration are listed below, followed by the updated code snippet.

  1. No accessKeyId and secretAccessKeyId are passed
  2. EMR_EC2_DefaultRole is replaced by the newly configured role
  3. A security configuration was added

    AmazonElasticMapReduce emrClient = AmazonElasticMapReduceClientBuilder.standard()
                    .withRegion(region)
                    .build();
    
    JobFlowInstancesConfig jobFlowInstanceConfig = new JobFlowInstancesConfig()
                .withEc2SubnetId("subnetId")
                .withEc2KeyName("ec2KeyName")
                .withInstanceCount(3)
                .withKeepJobFlowAliveWhenNoSteps(true)
                // instance types are passed as strings
                .withMasterInstanceType("c5.4xlarge")
                .withSlaveInstanceType("c5.4xlarge");

    RunJobFlowRequest request = new RunJobFlowRequest()
                .withName("clusterName")
                .withReleaseLabel("emr-5.23.0")
                // applications installed on the cluster
                .withApplications(
                        new Application().withName("Hadoop"),
                        new Application().withName("Spark"),
                        new Application().withName("Ganglia"),
                        new Application().withName("Zeppelin"))
                .withLogUri("s3 path")
                .withServiceRole("EMR_DefaultRole")
                .withJobFlowRole("name-of-role-created")
                .withInstances(jobFlowInstanceConfig)
                .withSecurityConfiguration("Security configuration Name");
    
    RunJobFlowResult runJobFlowResult = emrClient.runJobFlow(request);
    

I get following error:

com.amazonaws.services.elasticmapreduce.model.AmazonElasticMapReduceException: Role '' is not well-formed. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: 0d5ed77e-ed0e-49fd-bd33-f88213ce08c3)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1701)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1356)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1102)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:759)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:733)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:715)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:675)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:657)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:521)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.doInvoke(AmazonElasticMapReduceClient.java:2043)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.invoke(AmazonElasticMapReduceClient.java:2010)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.invoke(AmazonElasticMapReduceClient.java:1999)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.executeRunJobFlow(AmazonElasticMapReduceClient.java:1770)
    at com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient.runJobFlow(AmazonElasticMapReduceClient.java:1742)

Since the error says the role is not well-formed, I tried several other formats for the value passed to .withJobFlowRole("name-of-role-created"), but still got the same issue. These are the formats I tried:

arn:aws:iam::639116131780:role/name-of-role-created
arn:aws:iam::639116131780:instance-profile/name-of-role-created
arn:aws:iam::639116131780:role/name-of-role-created/*
arn:aws:iam::639116131780:instance-profile/name-of-role-created/*
arn:aws:sts::639116131780:assumed-role/name-of-role-created
arn:aws:sts::639116131780:assumed-role/name-of-role-created/*

I get the same error every time:

com.amazonaws.services.elasticmapreduce.model.AmazonElasticMapReduceException: Role 'arn:aws:iam::639116131780:role/name-of-role-created' is not well-formed. (Service: AmazonElasticMapReduce; Status Code: 400; Error Code: ValidationException; Request ID: 0d5ed77e-ed0e-49fd-bd33-f88213ce08c3)

Upvotes: 1

Views: 700

Answers (2)

Lamanus

Reputation: 13541

JobFlowRole is the role that is applied to the EMR instances; it is not the role used when you create the EMR cluster. I think you misread the option.

If you want to use a role instead of API keys, you have to look at how the client obtains its AWS credentials. For example, with an S3 client (AWS SDK for Java v2 syntax),

S3Client s3 = S3Client.builder()
              .credentialsProvider(InstanceProfileCredentialsProvider.builder().build())
              .build();

where

InstanceProfileCredentialsProvider.builder().build()

uses the role of the instance.

Upvotes: 0

madhead

Reputation: 33374

According to the docs, the JobFlowRole parameter is not an ARN, but just a string, like EMR_EC2_DefaultRole (the default value). Use a format like that.
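
As a rough illustration of "use a format like that", the sketch below reduces an ARN-style value to a bare name before it is passed to withJobFlowRole. The class JobFlowRoleNames and its method toPlainName are hypothetical helpers, not part of the AWS SDK. The sketch assumes that the instance profile carries the same name as the role and that the name is the last path segment of the ARN; neither is confirmed by the question. It also rejects an empty value up front, since the first error message in the question shows an empty role string.

```java
/**
 * Hypothetical helper (not part of the AWS SDK): reduces an IAM ARN such as
 * "arn:aws:iam::639116131780:role/name-of-role-created" to the bare name
 * "name-of-role-created". Plain names are returned unchanged.
 */
final class JobFlowRoleNames {

    private JobFlowRoleNames() {
    }

    static String toPlainName(String roleOrArn) {
        if (roleOrArn == null || roleOrArn.trim().isEmpty()) {
            throw new IllegalArgumentException("role name must not be empty");
        }
        String value = roleOrArn.trim();
        if (!value.startsWith("arn:")) {
            // Already a plain name such as EMR_EC2_DefaultRole.
            return value;
        }
        // ARN layout: arn:partition:service:region:account-id:resource
        String[] parts = value.split(":", 6);
        if (parts.length < 6 || parts[5].isEmpty()) {
            throw new IllegalArgumentException("ARN has no resource part: " + value);
        }
        // The resource looks like "role/name" or "instance-profile/path/name".
        String resource = parts[5];
        String name = resource.substring(resource.lastIndexOf('/') + 1);
        if (name.isEmpty()) {
            throw new IllegalArgumentException("ARN has no role name: " + value);
        }
        return name;
    }
}
```

In the question's code this would be used as .withJobFlowRole(JobFlowRoleNames.toPlainName(roleValue)), where roleValue is whatever string the configuration supplies. Whether the plain name alone resolves the error still depends on an instance profile with that name existing in the account.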

Upvotes: 1
