Rashmi

Reputation: 43

hadoop - vertica jar

I am trying to transfer data from Vertica to Hive. According to the manual, the following should be set as the input format:

-inputformat com.vertica.hadoop.deprecated.VerticaStreamingInput 

But the hadoop-vertica jar contains the class org.apache.hadoop.vertica.VerticaStreamingInput, not the one above.

So it throws the following exception:

Exception in thread "main" java.lang.RuntimeException:
  class org.apache.hadoop.vertica.VerticaStreamingInput not
  org.apache.hadoop.mapred.InputFormat

The full command is:

$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
  -libjars $HADOOP_HOME/lib/hadoop-vertica.jar \
  -Dmapred.vertica.hostnames=VerticaHost \
  -Dmapred.vertica.database=ExampleDB \
  -Dmapred.vertica.username=ExampleUser \
  -Dmapred.vertica.password=password123 \
  -Dmapred.vertica.port=5433 \
  -Dmapred.vertica.input.query="SELECT * FROM allTypes ORDER BY key" \
  -Dmapred.vertica.input.delimiter=, \
  -Dmapred.map.tasks=1 \
  -inputformat com.vertica.hadoop.deprecated.VerticaStreamingInput \
  -input /tmp/input -output /tmp/output -reducer /bin/cat -mapper /bin/cat

Hive runs in a CDH-4.4.0-1.cdh4.4.0.p0.39 environment, and Vertica is 7.1.

If I have the wrong hadoop-vertica jar, where can I get the correct one? If that's not the problem, what am I doing wrong?

So where does this com.vertica.hadoop.deprecated.VerticaStreamingInput class come from? I got it from the instructions for installing the Vertica connector (page 9, step 5): https://my.vertica.com/docs/7.0.x/PDF/HP_Vertica_7.0.x_HadoopIntegration.pdf

Upvotes: 0

Views: 335

Answers (1)

Monica Cellio

Reputation: 1589

I just downloaded the Hadoop Connector for MapReduce from the downloads page on my.vertica.com. I took the 2.0 version (which supports CDH 4), since that's the Hadoop version you said you're using.

I looked in the hadoop-vertica.jar file in the downloaded file (yarn-vertica_1.6.0.zip) and it has the class file in the correct place (com.vertica.hadoop.deprecated). The zip file also includes the source, so you can inspect it.
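If you want to check for yourself which of the two class names a given jar contains, a rough sketch like the one below may help. It assumes `unzip` is installed. The helper names `class_to_entry` and `jar_has_class` and the `VERTICA_JAR` variable are made up for this example and are not part of the connector.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of the Vertica connector.

# Convert a fully qualified class name to the entry path used inside a jar,
# e.g. a.b.C -> a/b/C.class
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr . /)"
}

# Succeed if the jar ($1) lists an entry for the class ($2).
# Assumes `unzip` is available; a missing or unreadable jar counts as "not found".
jar_has_class() {
  unzip -l "$1" 2>/dev/null | grep -qF "$(class_to_entry "$2")"
}

# Only runs when VERTICA_JAR (an assumed variable name) points at a jar.
VERTICA_JAR="${VERTICA_JAR:-}"
if [ -n "$VERTICA_JAR" ]; then
  for cls in \
    com.vertica.hadoop.deprecated.VerticaStreamingInput \
    org.apache.hadoop.vertica.VerticaStreamingInput
  do
    if jar_has_class "$VERTICA_JAR" "$cls"; then
      echo "found:   $cls"
    else
      echo "missing: $cls"
    fi
  done
fi
```

Saved as, say, check-jar.sh, you would run it as `VERTICA_JAR=$HADOOP_HOME/lib/hadoop-vertica.jar sh check-jar.sh`. If only the org.apache.hadoop.vertica entry shows up as found, the jar in $HADOOP_HOME/lib is probably not the one from the connector download.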

I can't tell where org.apache.hadoop.vertica.VerticaStreamingInput is coming from, but downloading a fresh copy of the connector should fix your problem. Make sure to download the JDBC driver and do the other Java configuration described in the documentation.

Upvotes: 3
