NSA

Reputation: 6027

When running a custom jar in Amazon EMR, where is my jar run from?

All,

I am packaging my custom jar with all its dependencies, and one of them conflicts with another jar on the EMR instance. I want to add a step that sets my classpath to the directory containing my custom jar, but to do that I need to know where the jar will reside on the various nodes and whether there are any environment variables I can use to make these changes. Building against the same jar versions that are on EMR is not possible, but if someone knows of a better way to resolve the root problem, that would also be welcome input.

Thank you,

Upvotes: 0

Views: 2212

Answers (1)

user1452132

Reputation: 1768

You can look at the 'controller' log file to see where the jar file is copied to, as well as the hadoop streaming jar command that was run.

The custom jar gets copied to some folder under /mnt/var/hadoop (specific to the step). The only way this jar file would conflict with another jar is if that other jar is under hadoop/lib and is therefore part of the classpath of 'hadoop jar'.

One solution is to use a script-runner step to run the jar file instead of a 'custom jar' step. There you can override the classpath and set any relevant environment variables.

http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-hadoop-script.html

This script can even run your custom jar using the 'hadoop jar' command, which is in essence almost equivalent to running the custom jar directly.
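As a rough sketch of what such a script-runner script might look like (not an official EMR recipe): it assumes the jar has already been copied to local disk on the node, and it relies on the standard Hadoop environment variables HADOOP_CLASSPATH and HADOOP_USER_CLASSPATH_FIRST. The function name run_custom_jar and the HADOOP_CMD override are made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper for a script-runner step.
# Usage: <script> <local-jar-path> <main-class> [job args...]
set -eu

run_custom_jar() {
  local jar_path="$1"
  shift

  # Prepend the custom jar to whatever classpath is already configured.
  export HADOOP_CLASSPATH="${jar_path}${HADOOP_CLASSPATH:+:${HADOOP_CLASSPATH}}"

  # Ask the hadoop launcher to put user entries ahead of its own lib jars.
  export HADOOP_USER_CLASSPATH_FIRST=true

  # HADOOP_CMD is an illustrative override (e.g. HADOOP_CMD=echo for a dry
  # run); it defaults to the regular 'hadoop' command.
  "${HADOOP_CMD:-hadoop}" jar "${jar_path}" "$@"
}

# Only run when arguments are supplied.
if [ "$#" -gt 0 ]; then
  run_custom_jar "$@"
fi
```

It could be invoked with something like `./run_custom_jar.sh /home/hadoop/my-job.jar com.example.Main arg1`, where the path and class name are placeholders. These variables affect the JVM started by 'hadoop jar' on the node running the script; task JVMs on other nodes may need separate job configuration.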

Upvotes: 1
