Reputation: 535
I have data in HDFS. And I wanted to load that data into hbase and hive table. I have written a bash shell script in which I have written a pig script to load the data form HDFS to HBASE and also written hive script to load the data from HDFS to HIVE table which are working perfectly fine.Here my HDFS data files are with the same structure and I'm loading all the data files into single hbase and hive table.
Now my query is suppose if I receive some more data files in HDFS directory and if I run the shell script again it will create hbase and hive table again with the same name and tells table already exists. How can I write a hive and hbase query so that 1st it will check for the table existence, if table does not exists it create the table for the 1st time and load the data from HDFS to HBASE & Hive table. If the table is already exists then it will just insert the data into an existing hbase and hive table. It should not overwrite the data alreday exists in the tables. How this can be done ?
Below is my script file: myScript.sh
echo "create 'goodtable','gt'" | hbase shell
pig -f a.pig -param input=/user/user/d/
hive -f h.hql
Where a.pig :
G = LOAD '$input' USING PigStorage(',') as (c1:chararray, c2:chararray,c3:chararray,c4:chararray,c5:chararray);
STORE G INTO 'hbase://goodtable' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('gt:name gt:state gt:phone_no gt:gender');
h.hql:
create external table hive_table(
id int,
name string,
state string,
phone_no int,
gender string) row format delimited fields terminated by ',' stored as textfile;
LOAD DATA INPATH '/user/user/d/' INTO TABLE hive_table;
Upvotes: 0
Views: 5658
Reputation: 8937
I just wanted to add an example for HBase as Hive was already covered before:
if [[ $(echo "exists 'goodtable'" | hbase shell | grep 'not exist') ]];
then
echo "create 'goodtable','gt'" | hbase shell;
fi
Upvotes: 2
Reputation: 231
@visakh is correct - you can see if table exists in HBase by entering the HBase shell, and typing : exists '<tablename>
In order to do this without entering the HBase shell interactively, you can create a simple ruby script such as the following:
exists 'mytable'
exit
Let's say you save this to a file called tabletest.rb. You can then execute this script by calling hbase shell tabletest.rb
. This will create the following output, which you can then parse from your shell script:
Table tableisthere does exist
0 row(s) in 0.9830 seconds
OR
Table tableisNOTthere does not exist
0 row(s) in 0.9830 seconds
Adding more details for 'all in one' script:
Alternatively, you can create a more advanced script in ruby that checks for table existence and then will create it if needed - this is done calling the HBaseAdmin java api from within the ruby script.
conf = HBaseConfiguration.new
hbaseAdmin = HBaseAdmin.new(conf)
if !hbaseAdmin.tableExists('mytable')
hbaseAdmin.createTable('mytable',...)
end
Upvotes: 0
Reputation: 2553
For HIVE
, you can add the command IF NOT EXISTS
in the CREATE TABLE
statement. See the documentation
I don't have much experience on Hbase
, but I believe you can use EXISTS table_name
command to check whether the table exists and then create
the table is it doesn't exist. See here
Upvotes: 0