Reputation: 146
I have a problem similar to this one.
The following is what I used:
I followed the Protocol Buffers Java tutorial to create my data file "testbook".
Then I created an HDFS folder with
hdfs dfs -mkdir /protobuf_data
and uploaded "testbook" to HDFS with
hdfs dfs -put testbook /protobuf_data
Then I followed the elephant-bird page to create the table; the syntax is like this:
create table addressbook
row format serde 'com.twitter.elephantbird.hive.serde.ProtobufDeserializer'
with serdeproperties (
  'serialization.class' = 'com.example.tutorial.AddressBookProtos$AddressBook')
stored as
  inputformat 'com.twitter.elephantbird.mapred.input.DeprecatedRawMultiInputFormat'
  outputformat 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
location '/protobuf_data/';
All of that worked.
But when I submit the query select * from addressbook; no results come back,
and I can't find any logs with errors to debug from.
Could someone help me?
Many thanks.
Upvotes: 2
Views: 3573
Reputation: 146
The problem has been solved.
At first I put the protobuf binary data directly into HDFS, and no results showed, because it doesn't work that way.
After asking some senior colleagues, I learned that protobuf binary data needs to be written into some kind of container file format, such as a Hadoop SequenceFile.
The elephant-bird page mentions this too, but at first I didn't fully understand it.
After writing the protobuf binary data into a SequenceFile, I could read the protobuf data with Hive.
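For anyone stuck on the same step, here is a rough sketch of how writing serialized protobuf messages into a SequenceFile can look. This is not my exact code: the class name, output path, and the person's field values are illustrative, it assumes the AddressBook classes generated by the protobuf Java tutorial (a proto2 tutorial; field accessor names may differ in your .proto), and it assumes the Hadoop client libraries are on the classpath. Each record's value is simply the raw serialized message wrapped in a BytesWritable.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;

import com.example.tutorial.AddressBookProtos.AddressBook;
import com.example.tutorial.AddressBookProtos.Person;

// Hypothetical example class, not from the original post.
public class WriteAddressBookSeqFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Illustrative output path inside the HDFS folder from the question.
        Path out = new Path("/protobuf_data/testbook.seq");

        // Build a small AddressBook message (values are made up).
        AddressBook book = AddressBook.newBuilder()
                .addPerson(Person.newBuilder()
                        .setId(1)
                        .setName("Alice")
                        .build())
                .build();

        // Write the serialized protobuf bytes as the SequenceFile value;
        // the key carries no information here, so NullWritable is used.
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(out),
                SequenceFile.Writer.keyClass(NullWritable.class),
                SequenceFile.Writer.valueClass(BytesWritable.class))) {
            writer.append(NullWritable.get(),
                          new BytesWritable(book.toByteArray()));
        }
    }
}
```

After running something like this, the file in /protobuf_data is a real SequenceFile container rather than bare protobuf bytes, which is what the Hive serde expects to iterate over.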
And because I use the SequenceFile format, the create table syntax uses:
inputformat 'org.apache.hadoop.mapred.SequenceFileInputFormat'
outputformat 'org.apache.hadoop.mapred.SequenceFileOutputFormat'
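Putting it together, the revised statement looked roughly like this (a reconstruction, not a verbatim copy of my DDL; only the inputformat/outputformat lines changed from the statement in the question):

```sql
create table addressbook
row format serde 'com.twitter.elephantbird.hive.serde.ProtobufDeserializer'
with serdeproperties (
  'serialization.class' = 'com.example.tutorial.AddressBookProtos$AddressBook')
stored as
  inputformat 'org.apache.hadoop.mapred.SequenceFileInputFormat'
  outputformat 'org.apache.hadoop.mapred.SequenceFileOutputFormat'
location '/protobuf_data/';
```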
Hope this helps others who are new to Hadoop, Hive, and elephant-bird too.
Upvotes: 4