Reputation: 33
I am trying to load a TSV file into HBase running on HDInsight in the Microsoft Azure cloud. Following the recommended approach, I connect through Remote Desktop and run ImportTsv from the command line to load t1.tsv (two tab-separated columns) from HDFS into the HBase table t1:
C:\apps\dist\hbase-0.98.0.2.1.5.0-2057-hadoop2\bin>hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,num t1 t1.tsv
This fails with:
ERROR: One or more columns in addition to the row key and timestamp(optional) are required
Usage: importtsv -Dimporttsv.columns=a,b,c
If I swap the order of the columns to num,HBASE_ROW_KEY:
C:\apps\dist\hbase-0.98.0.2.1.5.0-2057-hadoop2\bin>hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=num,HBASE_ROW_KEY t1 t1.tsv
I get:
ERROR: Must specify exactly one column as HBASE_ROW_KEY
Usage: importtsv -Dimporttsv.columns=a,b,c
This suggests that either the comma separator in the column list is not recognized or the column name is incorrect. I also tried a column with a qualifier (num:v) and a quoted name ('num'), but neither helped.
Any ideas what could be wrong here? Thanks.
Upvotes: 0
Views: 1071
Reputation: 979
>hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns="HBASE_ROW_KEY,d:c1,d:c2" testtable /example/inputfile.txt
This works for me. I think the difference lies in how the Linux and Windows terminals parse arguments: on Windows you need to wrap the value in quotation marks so that it is passed as a single string; otherwise the comma-separated column list might not be recognized.
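To illustrate why the quotes may matter: as far as I know, cmd.exe treats commas, semicolons, and equals signs (as well as spaces and tabs) as delimiters when it hands arguments to a batch script, and on Windows the hbase command is a batch script. The Python sketch below is only a simplified model of that splitting, not the real cmd.exe parser; the function name split_batch_args is made up for this example.

```python
from typing import List


def split_batch_args(command_tail: str) -> List[str]:
    """Simplified model (an assumption, not the real cmd.exe parser) of how
    arguments are split into %1, %2, ... for a batch script."""
    delimiters = set(" \t,;=")  # characters treated as argument separators
    args: List[str] = []
    current = ""
    in_quotes = False
    for ch in command_tail:
        if ch == '"':
            # Quotes toggle "protected" mode and stay part of the token.
            in_quotes = not in_quotes
            current += ch
        elif ch in delimiters and not in_quotes:
            # An unquoted delimiter ends the current token.
            if current:
                args.append(current)
                current = ""
        else:
            current += ch
    if current:
        args.append(current)
    return args


if __name__ == "__main__":
    # Unquoted: the column list is broken apart at the comma.
    print(split_batch_args("-Dimporttsv.columns=HBASE_ROW_KEY,num t1 t1.tsv"))
    # ['-Dimporttsv.columns', 'HBASE_ROW_KEY', 'num', 't1', 't1.tsv']

    # Quoted: the column list survives as a single token.
    print(split_batch_args('-Dimporttsv.columns="HBASE_ROW_KEY,num" t1 t1.tsv'))
    # ['-Dimporttsv.columns', '"HBASE_ROW_KEY,num"', 't1', 't1.tsv']
```

If this model is right, it would be consistent with both errors in the question: in each case only the first entry of the column list appears to reach ImportTsv.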
Upvotes: 1