João M. S. Silva

Reputation: 1148

Shouldn't Oozie/Sqoop jar location be configured during package installation?

I'm using HDP 2.4 in CentOS 6.7.

I have created the cluster with Ambari, so Oozie was installed and configured by Ambari.

I got two errors while running Oozie/Sqoop related to jar file location. The first concerned postgresql-jdbc.jar, since the Sqoop job is incrementally importing from Postgres. I added the postgresql-jdbc.jar file to HDFS and pointed to it in workflow.xml:

<file>/user/hdfs/sqoop/postgresql-jdbc.jar</file>

That solved the problem. The second error seems to concern kite-data-mapreduce.jar. However, doing the same for this file:

<file>/user/hdfs/sqoop/kite-data-mapreduce.jar</file>

does not seem to solve the problem:

Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SqoopMain], main() threw exception, org/kitesdk/data/DatasetNotFoundException java.lang.NoClassDefFoundError: org/kitesdk/data/DatasetNotFoundException

It seems strange that this is not automatically configured by Ambari and that we have to copy jar files into HDFS as we start getting errors.

Is this the correct methodology or did I miss some configuration step?

Upvotes: 1

Views: 760

Answers (1)

YoungHobbit

Reputation: 13402

This happens because jars are missing from the classpath. I suggest setting the property oozie.use.system.libpath=true in the job.properties file. All the Sqoop-related jars will then be added to the classpath automatically from /user/oozie/share/lib/lib_<timestamp>/sqoop/*.jar. After that, add only the custom jars you need to the lib directory of the workflow application path.
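As a rough illustration, a job.properties along these lines would enable the system libpath. This is only a sketch: the host names and ports are placeholders, and the application path /user/hdfs/sqoop is assumed from the paths in the question, so adjust all of them for your cluster.

```properties
# Placeholder host names and ports (assumptions, not taken from the question)
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8050

# Make Oozie add the shared library jars for the action type (here: sqoop)
oozie.use.system.libpath=true

# Assumed workflow application path; it holds workflow.xml and an optional lib/ directory
oozie.wf.application.path=${nameNode}/user/hdfs/sqoop
```

With this layout, a custom jar such as postgresql-jdbc.jar could be placed at /user/hdfs/sqoop/lib/postgresql-jdbc.jar. Jars in the lib directory of the application path are picked up automatically, so the <file> entries in workflow.xml should no longer be needed.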

Upvotes: 2
