Reputation: 43
I am trying to transfer data from Vertica to Hive. According to the manual, the following should be set as the input format:
-inputformat com.vertica.hadoop.deprecated.VerticaStreamingInput
But my hadoop-vertica jar contains the class org.apache.hadoop.vertica.VerticaStreamingInput, not the one above.
So it throws the following exception:
Exception in thread "main" java.lang.RuntimeException:
class org.apache.hadoop.vertica.VerticaStreamingInput not
org.apache.hadoop.mapred.InputFormat
The full command is:
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
-libjars $HADOOP_HOME/lib/hadoop-vertica.jar \
-Dmapred.vertica.hostnames=VerticaHost \
-Dmapred.vertica.database=ExampleDB \
-Dmapred.vertica.username=ExampleUser \
-Dmapred.vertica.password=password123 \
-Dmapred.vertica.port=5433 \
-Dmapred.vertica.input.query="SELECT * FROM allTypes ORDER BY key" \
-Dmapred.vertica.input.delimiter=, \
-Dmapred.map.tasks=1 \
-inputformat com.vertica.hadoop.deprecated.VerticaStreamingInput \
-input /tmp/input -output /tmp/output -reducer /bin/cat -mapper /bin/cat
Hive runs in a CDH-4.4.0-1.cdh4.4.0.p0.39 environment, and Vertica is 7.1.
If I have the wrong hadoop-vertica jar, where can I get the correct one? If that's not the problem, what am I doing wrong?
So where does the com.vertica.hadoop.deprecated.VerticaStreamingInput class come from? I got it from installing the Vertica connector (page 9, step 5): https://my.vertica.com/docs/7.0.x/PDF/HP_Vertica_7.0.x_HadoopIntegration.pdf
Upvotes: 0
Views: 335
Reputation: 1589
I just downloaded the Hadoop Connector for MapReduce from the downloads page on my.vertica.com. I took the 2.0 version (which supports CDH 4), since that's the Hadoop version you said you're using.
I looked inside the hadoop-vertica.jar file in the downloaded archive (yarn-vertica_1.6.0.zip), and it has the class file in the correct package (com.vertica.hadoop.deprecated). The zip file also includes the source, so you can inspect it.
I can't tell where org.apache.hadoop.vertica.VerticaStreamingInput is coming from, but downloading a fresh copy of the connector should fix your problem. Make sure you also download the JDBC driver and do the other Java configuration described in the documentation.
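If you want to confirm which jar actually contains the class before and after replacing it, a rough sketch like the one below may help. It assumes `unzip` is installed (`jar tf` from a JDK lists entries in a similar way), and the helper names `class_to_entry` and `jar_has_class` are made up for this illustration, not part of Hadoop or the Vertica connector.

```shell
#!/bin/sh
# Hypothetical helpers for checking whether a jar contains a given class.

# Convert a fully qualified class name to its entry path inside a jar,
# e.g. a.b.C -> a/b/C.class
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed (exit status 0) if the jar lists the class, fail otherwise.
# A jar is a zip archive, so `unzip -l` can list its entries.
jar_has_class() {
    jar_path=$1
    entry=$(class_to_entry "$2")
    unzip -l "$jar_path" 2>/dev/null | grep -qF "$entry"
}

# Example call (jar path taken from the question; adjust to your install):
# jar_has_class "$HADOOP_HOME/lib/hadoop-vertica.jar" \
#     com.vertica.hadoop.deprecated.VerticaStreamingInput && echo "found"
```

Running the same check against every jar in $HADOOP_HOME/lib with the org.apache.hadoop.vertica class name might also show where the older class is coming from.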
Upvotes: 3