Reputation: 11
I'm currently trying to load data from Hive into Elasticsearch. I'm using Cloudera CDH 5.3, and I've already added the elasticsearch-hadoop Hive 2.0.2 jar to my Hive path. Elasticsearch 1.4.4 is up and running on 10.44.162.169.
I now have a table called hive_cdr with the following columns:
traffic_type_id (bigint)
appelant (int)
called_number (int)
call_duration (int)
location_number (string)
date_heure_appel (string)
I'm trying to define the ES table in Hive so that I can load some data into it. To do so, I ran this:
CREATE EXTERNAL TABLE es_hive_cdr (
traffic bigint ,
calling int ,
called int ,
duration int ,
location string ,
date string )
ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe'
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
'es.nodes'='10.44.162.169',
'es.resource'='indexCDR/typeCDR'
) ;
But I got an exception saying that EsStorage is not recognized.
I deleted the EsStorage line and ran the statement again to try to find out what's going on.
I then tried to load data from my hive_cdr table into the new one:
insert into table es_hive_cdr2
select
traffic_type_id,
appelant,
called_number,
call_duration,
location_number,
date_heure_appel
from hive_cdr;
But it's failing, and I got this error:
STAGE DEPENDENCIES:
Stage-1 is a root stage
Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
Stage-4
Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
Stage-2 depends on stages: Stage-0
Stage-3
Stage-5
Stage-6 depends on stages: Stage-5
STAGE PLANS:
Stage: Stage-1
Map Reduce
Map Operator Tree:
TableScan
alias: hive_cdr
Statistics: Num rows: 267130 Data size: 58768736 Basic stats: COMPLETE Column stats: NONE
Select Operator
expressions: traffic_type_id (type: bigint), appelant (type: int), called_number (type: int), call_duration (type: int), location_number (type: string), date_heure_appel (type: string)
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5
Statistics: Num rows: 267130 Data size: 58768736 Basic stats: COMPLETE Column stats: NONE
File Output Operator
compressed: false
Statistics: Num rows: 267130 Data size: 58768736 Basic stats: COMPLETE Column stats: NONE
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.elasticsearch.hadoop.hive.EsSerDe
name: default.es_hive_cdr2
Stage: Stage-7
Conditional Operator
Stage: Stage-4
Move Operator
files:
hdfs directory: true
destination: hdfs://master:8020/user/hive/warehouse/es_hive_cdr2/.hive-staging_hive_2015-03-02_14-09-08_285_4734041865540737822-2/-ext-10000
Stage: Stage-0
Move Operator
tables:
replace: false
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.elasticsearch.hadoop.hive.EsSerDe
name: default.es_hive_cdr2
Stage: Stage-2
Stats-Aggr Operator
Stage: Stage-3
Map Reduce
Map Operator Tree:
TableScan
File Output Operator
compressed: false
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.elasticsearch.hadoop.hive.EsSerDe
name: default.es_hive_cdr2
Stage: Stage-5
Map Reduce
Map Operator Tree:
TableScan
File Output Operator
compressed: false
table:
input format: org.apache.hadoop.mapred.TextInputFormat
output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
serde: org.elasticsearch.hadoop.hive.EsSerDe
name: default.es_hive_cdr2
Stage: Stage-6
Move Operator
files:
hdfs directory: true
destination: hdfs://master:8020/user/hive/warehouse/es_hive_cdr2/.hive-staging_hive_2015-03-02_14-09-08_285_4734041865540737822-2/-ext-10000
I really need some help and guidance, and I would be very grateful for it!
Upvotes: 0
Views: 1995
Reputation: 21
Try setting the table properties explicitly:
TBLPROPERTIES('es.resource' = 'myviews/myview', 'es.nodes' = 'hostname-of-es-cluster', 'es.port' = '9200', 'es.input.json' = 'false', 'es.write.operation' = 'index', 'es.index.auto.create' = 'yes','es.nodes.wan.only' = 'true');
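For reference, here is a minimal sketch of how a full statement might look with properties like these. It follows the pattern shown in the elasticsearch-hadoop documentation, which declares only STORED BY and no ROW FORMAT SERDE clause. The jar path and the lowercase index/type name cdr_index/cdr_type are assumptions for illustration (Elasticsearch requires lowercase index names, so indexCDR would be rejected); adjust them to your setup. Also note that if the STORED BY line is removed, Hive treats the table as an ordinary HDFS-backed table, which would match the hdfs:// destinations in the plan from the question.

```sql
-- Register the connector jar for this session.
-- The path below is hypothetical; point it at wherever your jar lives.
ADD JAR /path/to/elasticsearch-hadoop-hive-2.0.2.jar;

-- Column names are backtick-quoted where they collide with Hive keywords
-- (LOCATION, DATE). Only STORED BY is declared; the storage handler
-- supplies its own SerDe.
CREATE EXTERNAL TABLE es_hive_cdr (
  traffic    BIGINT,
  calling    INT,
  called     INT,
  duration   INT,
  `location` STRING,
  `date`     STRING
)
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES (
  'es.nodes' = '10.44.162.169',
  'es.port' = '9200',
  'es.resource' = 'cdr_index/cdr_type',
  'es.index.auto.create' = 'true'
);

-- Write the rows from the source Hive table into Elasticsearch.
INSERT OVERWRITE TABLE es_hive_cdr
SELECT
  traffic_type_id,
  appelant,
  called_number,
  call_duration,
  location_number,
  date_heure_appel
FROM hive_cdr;
```

If the CREATE statement still reports that the storage handler class cannot be found, the jar is most likely not visible to the Hive session, so check the ADD JAR path first.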
Also change this property in your elasticsearch.yml file to the following:
network.host: _site_
Upvotes: 0