Reputation: 1810
I have .csv files in HDFS. I want to load them into HBase tables without using a Pig script.
Is there any other way to do this?
Upvotes: 2
Views: 330
Reputation: 29237
There are several ways to do this; two options are described below.
ImportTsv
ImportTsv is a utility that will load data in TSV format into HBase. It has two distinct usages: loading data from TSV format in HDFS into HBase via Puts, and preparing StoreFiles to be loaded via the completebulkload utility.
To load data via Puts (i.e., non-bulk loading):
$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c <tablename> <hdfs-inputdir>
To generate StoreFiles for bulk-loading:
$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=a,b,c -Dimporttsv.bulk.output=hdfs://storefile-outputdir <tablename> <hdfs-data-inputdir>
These generated StoreFiles can then be loaded into HBase with the completebulkload utility (see the "CompleteBulkLoad" section of the HBase reference guide).
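For completeness, the bulk-load step can be run with the LoadIncrementalHFiles tool; the class name below is real, while the output directory and table name are the placeholders from the command above:
$ bin/hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles hdfs://storefile-outputdir <tablename>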
Example for a comma-separated file. Note that the column list must include HBASE_ROW_KEY exactly once, the other columns must be given in family:qualifier form, and the table name comes before the input path:
$ bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=, -Dimporttsv.columns="HBASE_ROW_KEY,cf:c1,cf:c2,..." <tablename> hdfs://servername:/tmp/yourcsv.csv
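Depending on your HBase version, ImportTsv may expect the target table (and its column family) to exist beforehand. A placeholder table matching the columns above could be created in the HBase shell first; 'yourtable' and 'cf' here are assumptions, not names from the original answer:
hbase> create 'yourtable', 'cf'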
Write a MapReduce program with a proper CSV parser in case your CSV is too complex for ImportTsv (for example, quoted fields that contain commas); a sketch of this approach follows.
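As an illustration, here is a minimal map-only sketch of that approach, assuming the HBase 1.x+ client API, a pre-created target table passed as the second argument, a column family cf, and columns c1..c3; all of these names are assumptions, not part of the original answer. It uses a naive split on commas, so a genuinely complex CSV would need a real parser such as OpenCSV in its place.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class CsvToHBase {

    // Maps each CSV line to an HBase Put. Assumes a simple layout:
    // first field is the row key, remaining fields go into one column family.
    static class CsvMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
        private static final byte[] CF = Bytes.toBytes("cf");  // assumed column family
        private static final byte[][] QUALIFIERS = {
            Bytes.toBytes("c1"), Bytes.toBytes("c2"), Bytes.toBytes("c3")  // assumed columns
        };

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Naive split; swap in a real CSV parser for quoted fields.
            String[] fields = value.toString().split(",", -1);
            if (fields.length < QUALIFIERS.length + 1) {
                return;  // skip malformed lines
            }
            byte[] rowKey = Bytes.toBytes(fields[0]);
            Put put = new Put(rowKey);
            for (int i = 0; i < QUALIFIERS.length; i++) {
                put.addColumn(CF, QUALIFIERS[i], Bytes.toBytes(fields[i + 1]));
            }
            context.write(new ImmutableBytesWritable(rowKey), put);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "csv-to-hbase");
        job.setJarByClass(CsvToHBase.class);
        job.setInputFormatClass(TextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));  // HDFS path to CSV files
        job.setMapperClass(CsvMapper.class);
        // Configures TableOutputFormat for the target table (args[1]); no reducer.
        TableMapReduceUtil.initTableReducerJob(args[1], null, job);
        job.setNumReduceTasks(0);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

With zero reduce tasks, each mapper writes its Puts straight to the table through TableOutputFormat, which TableMapReduceUtil.initTableReducerJob configures.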
Upvotes: 2