Reputation: 4557
There is an empty HBase table with two column families:
create 'emp', 'personal_data', 'professional_data'
Now I am trying to map a Hive external table to it, which would naturally have some columns:
CREATE EXTERNAL TABLE emp(id int, city string, name string, occupation string, salary int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":id,
personal_data:city,
personal_data:name,
professional_data:occupation,
professional_data:salary")
TBLPROPERTIES ("hbase.table.name" = "emp", "hbase.mapred.output.outputtable" = "emp");
Now the error that I get is this:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.hbase.HBaseSerDe: columns has 5 elements while hbase.columns.mapping has 6 elements (counting the key if implicit))
Could you please help me out? Am I doing something wrong?
Upvotes: 2
Views: 5347
Reputation: 5315
In your mapping, you're referencing the id field, but you should reference the HBase :key keyword. As stated in the documentation:
a mapping entry must be either :key or of the form column-family-name:[column-name][#(binary|string)]
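Because :id is not recognized as the row key, Hive appears to count an implicit key entry on top of your five entries, which is why the error reports 6 mapping elements against 5 columns. Just replace :id with :key and that should do it: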
CREATE EXTERNAL TABLE emp(id int, city string, name string, occupation string, salary int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,personal_data:city,personal_data:name,professional_data:occupation,professional_data:salary")
TBLPROPERTIES ("hbase.table.name" = "emp", "hbase.mapred.output.outputtable" = "emp");
The column mapping is based on the ordering of the columns, not on their names. In the documentation, in the section "Multiple Columns and Families", you can see that the names don't matter:
CREATE TABLE hbase_table_1(key int, value1 string, value2 int, value3 int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
"hbase.columns.mapping" = ":key,a:b,a:c,d:e"
);
The mapping is then, by position:
key -> :key (the HBase row key)
value1 -> a:b
value2 -> a:c
value3 -> d:e
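To illustrate the positional rule against the emp table from the question, here is a minimal sketch. The Hive table name emp_by_position and the Hive column names (row_id, town, full_name, job, pay) are hypothetical and chosen only to show that they need not match the HBase qualifiers. It assumes the HBase table emp already exists with the two column families from the question.

```sql
-- Hypothetical second Hive table over the same HBase table 'emp'.
-- The Hive column names differ from the HBase qualifiers on purpose:
-- the n-th Hive column is bound to the n-th mapping entry.
CREATE EXTERNAL TABLE emp_by_position(
  row_id int,        -- 1st column -> :key (HBase row key)
  town string,       -- 2nd column -> personal_data:city
  full_name string,  -- 3rd column -> personal_data:name
  job string,        -- 4th column -> professional_data:occupation
  pay int            -- 5th column -> professional_data:salary
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
-- Mapping kept on one line with no whitespace between entries:
-- 5 Hive columns, 5 mapping entries, :key listed explicitly.
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,personal_data:city,personal_data:name,professional_data:occupation,professional_data:salary")
TBLPROPERTIES ("hbase.table.name" = "emp");
```

The only things that have to line up are the number of Hive columns and the number of mapping entries, with :key occupying the position of the column that should hold the row key.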
Upvotes: 5