Reputation: 189
I want to add jobs from my Java code in Eclipse to an already-running EMR cluster, to save the startup time (creating EC2 instances, bootstrapping, ...).
I know how to start a new cluster from Java code, but it terminates after all jobs are done.
RunJobFlowRequest runFlowRequest = new RunJobFlowRequest()
        .withName("Some name")
        .withInstances(instances)
        // .withBootstrapActions(bootstrapActions)
        .withJobFlowRole("EMR_EC2_DefaultRole")
        .withServiceRole("EMR_DefaultRole")
        .withSteps(firstJobStep, secondJobStep, thirdJobStep)
        .withLogUri("s3n://path/to/logs");

// Run the jobs
RunJobFlowResult runJobFlowResult = mapReduce.runJobFlow(runFlowRequest);
String jobFlowId = runJobFlowResult.getJobFlowId();
Upvotes: 1
Views: 252
Reputation: 2546
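You have to set the KeepJobFlowAliveWhenNoSteps parameter to true; otherwise the cluster is terminated after all the steps have executed. With this property set, the cluster stays in the WAITING state once the steps finish, so you can submit more steps to it.
Add .withKeepJobFlowAliveWhenNoSteps(true) to the existing code. In the AWS SDK for Java this setter is on JobFlowInstancesConfig, so it goes on the instances object passed to .withInstances(instances), not on the RunJobFlowRequest itself.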
Refer to this doc for further details.
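Once the cluster is kept alive, later jobs can be submitted to it as extra steps instead of starting a new cluster. Below is a minimal sketch of that, assuming the AWS SDK for Java v1 (the same API family as the question's code) and default credentials and region on the machine. The class name, step name, JAR location and arguments are hypothetical placeholders; the cluster id is the value returned earlier by runJobFlowResult.getJobFlowId().

```java
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClientBuilder;
import com.amazonaws.services.elasticmapreduce.model.ActionOnFailure;
import com.amazonaws.services.elasticmapreduce.model.AddJobFlowStepsRequest;
import com.amazonaws.services.elasticmapreduce.model.AddJobFlowStepsResult;
import com.amazonaws.services.elasticmapreduce.model.HadoopJarStepConfig;
import com.amazonaws.services.elasticmapreduce.model.StepConfig;

public class AddStepToRunningCluster {

    public static void main(String[] args) {
        // Id of the cluster that is already running (state WAITING or RUNNING),
        // e.g. the value saved from runJobFlowResult.getJobFlowId().
        String jobFlowId = args[0];

        // Assumption: credentials and region come from the default provider chain.
        AmazonElasticMapReduce mapReduce =
                AmazonElasticMapReduceClientBuilder.defaultClient();

        // Hypothetical job definition: replace the JAR path and arguments with your own.
        HadoopJarStepConfig jarConfig = new HadoopJarStepConfig()
                .withJar("s3://path/to/job.jar")
                .withArgs("s3://path/to/input", "s3://path/to/output");

        // CONTINUE leaves the cluster up even if this step fails.
        StepConfig extraStep = new StepConfig()
                .withName("Extra job step")
                .withActionOnFailure(ActionOnFailure.CONTINUE)
                .withHadoopJarStep(jarConfig);

        // Queue the step on the existing cluster; no new EC2 instances are created.
        AddJobFlowStepsRequest request = new AddJobFlowStepsRequest()
                .withJobFlowId(jobFlowId)
                .withSteps(extraStep);

        AddJobFlowStepsResult result = mapReduce.addJobFlowSteps(request);
        System.out.println("Submitted step ids: " + result.getStepIds());
    }
}
```

This only works while the cluster still exists, so the keep-alive flag has to be set when the cluster is created. A kept-alive cluster keeps billing until you terminate it explicitly.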
Upvotes: 1