Reputation: 91
I am running a 3 node HDP 2.2 cluster. Oozie version is 4.1.0.2.2 and Sqoop version is 1.4.5.2.2. I am using Sqoop job to do incremental imports from RDBMS into HDFS as shown below,
sqoop job –create JOB1 –meta-connect “jdbc:hsqldb:hsql://ip-address:16000/sqoop” — import –connect jdbc:oracle:thin:@ip-address:db –username db_user –password-file hdfs://ip-address:8020/user/oozie/.password_sqoop –table TABLE1 –target-dir /user/incremental/ –incremental lastmodified –check-column LAST_UPDATED –last-value “2013-08-12 18:13:44.0″ –merge-key ID –fields-terminated-by ‘|';
sqoop job –exec JOB1
The above 2 sqoop commands are working perfectly fine when run from command prompt. I am using sqoop-metastore (HSQLDB) to store the sqoop jobs.
The sqoop create job is working in OOZIE and i am able to see the sqoop job being listed in the sqoop-metastore after oozie job completes.
But when i put the sqoop exec job in the oozie workflow i get, “Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]”
error. The underlying map reduce job however shows as completed successfully. Checked for logs in /var/log/oozie but nothing there as well.
workflow.xml :
<workflow-app xmlns=”uri:oozie:workflow:0.4″ name=”oozie-wf”>
<start to=”sqoop-wf”/>
<action name=”sqoop-wf”>
<sqoop xmlns=”uri:oozie:sqoop-action:0.2″>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>fs.hdfs.impl.disable.cache</name>
<value>true</value>
</property>
</configuration>
<command>job –meta-connect “jdbc:hsqldb:hsql://ip-address:16000/sqoop” –exec JOB1</command>
</sqoop>
<ok to=”end”/>
<error to=”fail”/>
</action>
<kill name=”fail”>
<message>Failed, Error Message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name=”end”/>
</workflow-app>
job.properties :
nameNode=hdfs://ip-address:8020
jobTracker=ip-address:8050
oozie.wf.application.path=hdfs://ip-address:8020/user/oozie/sqoopoozie
oozie.use.system.libpath=true
oozie.sqoop.log.level=DEBUG
I have tried in multiple different ways for sqoop exec job in oozie but nothing works. Please help.
Upvotes: 0
Views: 1636
Reputation: 51
In the case of our cluster, creating jobs through Hue UI without specifying the metastore (using the --meta-connect
parameter) resulted in jobs being only sometimes returned by the sqoop job --list
command. In command prompt creating and listing jobs on the same machine worked; however, machine B could not find jobs created on machine A.
If you also can't access jobs can access jobs created on a different node, it seems like your sqoop metastore is not running. To solve it:
sqoop-metastore
command,$ ps aux | grep sqoop
should return one of the results as a Sqoop metastore,sqoop-site.xml
files. In our case we couldn't make ot work though, so we did it using the next step,$ sqoop job --exec my_job --meta-connect jdbc:hsqldb:hsql://node-running-metastore:16000/sqoop
. Note: To start the metastore, consider using &
and nohup
to run the metastore process in the background and to capture logs into a file:
$ nohup sqoop-metastore &>> /tmp/nohup-sqoop-metastore.out &
Upvotes: 0
Reputation: 775
It puzzles me as to how sqoop job --exec JOB1
works in the command prompt. The right syntax for executing the job (from command prompt) would be
sqoop job --exec JOB1 --meta-connect jdbc:hsqldb:hsql://ip-address:16000/sqoop
Perhaps you executed the job from the same machine that is running your sqoop metastore
Please make sure you have done the following
As for the oozie workflow, try the following workflow.xml
<workflow-app xmlns=”uri:oozie:workflow:0.4″ name=”oozie-wf”>
<start to=”sqoop-wf”/>
<action name=”sqoop-wf”>
<sqoop xmlns=”uri:oozie:sqoop-action:0.2″>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<configuration>
<property>
<name>fs.hdfs.impl.disable.cache</name>
<value>true</value>
</property>
</configuration>
<arg>job</arg>
<arg>--exec</arg>
<arg>JOB1</arg>
<arg>--meta-connect jdbc:hsqldb:hsql://ip-address:16000/sqoop</arg>
</sqoop>
<ok to=”end”/>
<error to=”fail”/>
</action>
<kill name=”fail”>
<message>Failed, Error Message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name=”end”/>
</workflow-app>
Upvotes: 1
Reputation: 131
Please ensure you are specifying your metastore in sqoop-site.xml and you are passing the site xml in your oozie workflow folder. You can pass it with xml tag .
Upvotes: 1