Reputation: 438
We work with Cloudera CDH 5.4.0 and have been trying to trigger an Oozie job from the Java API to send out emails. The program depends on two third-party jar files, activation.jar and mail.jar, to send the email using SMTP login. The Java program sends email fine from the IDE or from the packaged jar when the third-party jars are placed in the same folder on the file system.
But when we move the files to HDFS and try to configure the Oozie job, it fails to complete.
We have our oozie job xml as below (email.xml):
<workflow-app name="Email" xmlns="uri:oozie:workflow:0.5">
    <start to="java-95a1"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="java-95a1">
        <java>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <main-class>org.Emails</main-class>
            <java-opts>[{u'value': u''}]</java-opts>
        </java>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
And the job properties are:
nameNode=hdfs://localhost:8020
jobTracker=localhost:8021
queueName=default
weatherRoot=weather_ooze
mapreduce.jobtracker.kerberos.principal=foo
dfs.namenode.kerberos.principal=foo
oozie.libpath=${nameNode}/user/oozie/share/lib
oozie.wf.application.path=${nameNode}/user/${user.name}/${weatherRoot}
outputDir=weather-ooze
The files are placed in the HDFS folder as:
/user/oozie/OozieWFConfigs/emailAppDef/EmailJavaProgram.jar
/user/oozie/OozieWFConfigs/emailAppDef/email.xml
/user/oozie/OozieWFConfigs/emailAppDef/job.properties
/user/oozie/OozieWFConfigs/emailAppDef/lib/activation.jar
/user/oozie/OozieWFConfigs/emailAppDef/lib/mail.jar
We read in a forum that jar files placed in the lib folder are picked up automatically.
The Oozie job is triggered using the Java API as follows:
import java.util.Properties;
import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.WorkflowJob;
public class oozieclient {
    public static void main(String[] args) {
        OozieClient wc = new OozieClient("http://hdfs:[email protected]:11000/oozie");
        Properties conf = wc.createConfiguration();
        conf.setProperty("nameNode", "hdfs://kwt-dev-hdpdn6.hadoop.local:8020");
        conf.setProperty("jobTracker", "kwt-dev-hdpdn6.hadoop.local:8032");
        conf.setProperty("queueName", "default");
        conf.setProperty("oozie.libpath", "${nameNode}/user/oozie/OozieWFConfigs/emailAppDef/lib");
        conf.setProperty("oozie.use.system.libpath", "true");
        conf.setProperty("oozie.wf.rerun.failnodes", "true");
        conf.setProperty("oozieProjectRoot",
                "${nameNode}/user/oozie");
        conf.setProperty("appPath",
                "${oozieProjectRoot}/OozieWFConfigs/emailAppDef");
        conf.setProperty(OozieClient.APP_PATH, "${appPath}/email.xml");
        // conf.setProperty("inputDir", "${oozieProjectRoot}/data/*/*/*/*/*");
        conf.setProperty("outputDir", "${appPath}/output");
        try {
            String jobId = wc.run(conf);
            System.out.println("Workflow job, " + jobId + " submitted");
            while (wc.getJobInfo(jobId).getStatus() == WorkflowJob.Status.RUNNING) {
                System.out.println("Workflow job running ...");
                Thread.sleep(10 * 1000);
            }
            System.out.println("Workflow job completed ...");
            System.out.println(wc.getJobInfo(jobId));
        } catch (Exception r) {
            System.out.println("Errors " + r.getLocalizedMessage());
        }
    }
}
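The polling loop above continues only while the status is RUNNING, so it would stop right away for a job still in PREP and would spin forever on a job that hangs. A minimal sketch of a bounded wait is given here. To stay self-contained it reads the status through a `Supplier<String>`; the class `WorkflowPoller`, its method names, and the timeout parameters are illustrative assumptions and not part of the Oozie client API.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of the Oozie client library.
final class WorkflowPoller {
    private WorkflowPoller() {}

    /** True while the job is still in one of the non-final states PREP or RUNNING. */
    static boolean isActive(String status) {
        return "PREP".equals(status) || "RUNNING".equals(status);
    }

    /**
     * Polls the supplier until the status is no longer PREP/RUNNING,
     * the timeout elapses, or the thread is interrupted.
     * Returns the last status seen; the caller decides how to treat a timeout.
     */
    static String waitForCompletion(Supplier<String> status, long pollMillis, long timeoutMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        String current = status.get();
        while (isActive(current) && System.nanoTime() - start < timeoutNanos) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return current;
            }
            current = status.get();
        }
        return current;
    }
}
```

With the client from the question, the supplier could wrap `wc.getJobInfo(jobId).getStatus().name()` and rethrow the checked `OozieClientException` as an unchecked exception. If the returned status is still RUNNING after the timeout, the job is probably stuck and the launcher logs are the next place to look.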
When triggered, the job runs until it is 33% - 50% complete and then hangs. It neither terminates nor proceeds. Could someone help me with this? I cannot use the default email action in Oozie, as I need to add attachments to the email once it is working. I want it to work from a Java program using activation.jar and mail.jar.
When the job is triggered, the configuration is as follows:
appPath hdfs://kwt-dev-hdpdn6.hadoop.local:8020/user/oozie/OozieWFConfigs/emailAppDef
jobTracker kwt-dev-hdpdn6.hadoop.local:8032
mapreduce.job.user.name oozie
nameNode hdfs://kwt-dev-hdpdn6.hadoop.local:8020
oozie.use.system.libpath true
oozie.wf.application.path hdfs://kwt-dev-hdpdn6.hadoop.local:8020/user/oozie/OozieWFConfigs/emailAppDef/email.xml
oozie.wf.rerun.failnodes true
oozieProjectRoot hdfs://kwt-dev-hdpdn6.hadoop.local:8020/user/jinith.joseph
outputDir hdfs://kwt-dev-hdpdn6.hadoop.local:8020/user/oozie/OozieWFConfigs/emailAppDef/output
queueName default
user.name oozie
Upvotes: 1
Views: 959
Reputation: 438
After a week of trials, we have managed to send emails from Oozie jobs. As many forums and friends had pointed out, the issue was with the Guava version, which did not contain the elapsedTime() function.
So a workflow XML like the one below should work fine:
<workflow-app name="Drill_HDFS_Email" xmlns="uri:oozie:workflow:0.5">
    <start to="java-6abb"/>
    <kill name="Kill">
        <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <action name="java-6abb">
        <java>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.queue.name</name>
                    <value>default</value>
                </property>
                <property>
                    <name>oozie.launcher.mapreduce.job.classloader</name>
                    <value>true</value>
                </property>
                <property>
                    <name>oozie.launcher.mapreduce.job.ubertask.enable</name>
                    <value>false</value>
                </property>
            </configuration>
            <main-class>com.drill.Emails</main-class>
            <file>/user/oozie/OozieWFConfigs/drillEmailAppDef/lib/DrillJDBC.jar#DrillJDBC.jar</file>
            <file>/user/oozie/OozieWFConfigs/drillEmailAppDef/lib/activation.jar#activation.jar</file>
            <file>/user/oozie/OozieWFConfigs/drillEmailAppDef/lib/mail.jar#mail.jar</file>
            <file>/user/oozie/OozieWFConfigs/drillEmailAppDef/lib/drill-jdbc-all-1.0.0.jar#drill-jdbc-all-1.0.0.jar</file>
        </java>
        <ok to="End"/>
        <error to="Kill"/>
    </action>
    <end name="End"/>
</workflow-app>
As you might have already observed, two configs in particular determine which version of Guava gets picked up:
oozie.launcher.mapreduce.job.classloader = true
oozie.launcher.mapreduce.job.ubertask.enable = false
By default, ubertask is set to true, which tries to pick up the Guava jar shipped with Cloudera / Oozie. That jar is a lower version and does not contain the elapsedTime() function. If we set this property to false, it picks up Drill's jars, which contain the right Guava version.
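When a version conflict like this is suspected, it can help to log from inside the action which jar a class was actually loaded from and whether it exposes the expected method. The following is a rough diagnostic sketch using only reflection from the JDK; the class `ClasspathProbe` and its method names are made up for illustration.

```java
import java.security.CodeSource;
import java.util.Arrays;

// Hypothetical diagnostic helper for classpath conflicts.
final class ClasspathProbe {
    private ClasspathProbe() {}

    /** Describes where a class was loaded from, or why that is unknown. */
    static String whereLoaded(String className) {
        try {
            Class<?> c = Class.forName(className);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes that come from the JDK itself may have no code source.
            return src == null ? "(no code source, likely JDK)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    /** True if the class has at least one public method with the given name. */
    static boolean hasPublicMethod(String className, String methodName) {
        try {
            return Arrays.stream(Class.forName(className).getMethods())
                    .anyMatch(m -> m.getName().equals(methodName));
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

Assuming the method in question is the one on Guava's `com.google.common.base.Stopwatch`, printing `ClasspathProbe.whereLoaded("com.google.common.base.Stopwatch")` and `ClasspathProbe.hasPublicMethod("com.google.common.base.Stopwatch", "elapsedTime")` at the start of the main class would show in the launcher's stdout which Guava jar won.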
All the dependent third-party jars, and the jar containing our code to send out emails, are included as files in the Oozie workflow. The main class is looked up in the included jars and picked up from there.
In some forums, we have read that the jar files in the "lib" folder are read automatically. But we have not got that working without specifying them explicitly. Probably we are still missing some configs!
Anyway, hope this helps someone in the future.
Upvotes: 1
Reputation: 1360
The problem is related to the HDFS connection. The hosts in your Oozie URI, NameNode, and JobTracker settings do not match. I think you should replace localhost with your correct IP address.
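One way to catch this kind of mismatch before submitting is to compare the hosts of the configured endpoints. The following is only a sketch: the class `EndpointHosts` and its methods are hypothetical, and it flags just the specific case of mixing localhost with real host names, since different services can legitimately run on different machines.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical pre-submit sanity check for endpoint settings.
final class EndpointHosts {
    private EndpointHosts() {}

    /** Extracts the host from a full URI or from a bare host:port string. */
    static String hostOf(String endpoint) {
        // A bare "host:port" has no scheme, so add a placeholder one for parsing.
        String withScheme = endpoint.contains("://") ? endpoint : "placeholder://" + endpoint;
        String host = URI.create(withScheme).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host found in: " + endpoint);
        }
        return host.toLowerCase(Locale.ROOT);
    }

    static boolean isLocal(String host) {
        return host.equals("localhost") || host.equals("127.0.0.1");
    }

    /** True if some endpoints point at localhost while others point elsewhere. */
    static boolean mixesLocalhost(String... endpoints) {
        boolean sawLocal = false;
        boolean sawRemote = false;
        for (String endpoint : endpoints) {
            if (isLocal(hostOf(endpoint))) {
                sawLocal = true;
            } else {
                sawRemote = true;
            }
        }
        return sawLocal && sawRemote;
    }
}
```

For example, `EndpointHosts.mixesLocalhost("hdfs://localhost:8020", "kwt-dev-hdpdn6.hadoop.local:8032")` returns true, which is the combination you get when the values from job.properties and from the Java client are mixed.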
Upvotes: 1