Reputation: 21
I have XML files in HDFS and want to load them into an HBase table.
The links I found all use MapReduce to load the XML data into HBase. Is there an alternative way to load the files directly into an HBase table?
Upvotes: 0
Views: 156
Reputation: 35
Here is an example that loads an input3.xml file into HBase using Pig.
=== input3.xml =====
<document>
<url>htp://www.abc.com/</url>
<category>Sports</category>
<usercount>120</usercount>
<reviews>
<review>good site</review>
<review>This is Avg site</review>
<review>Bad site</review>
</reviews>
</document>
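-- XMLLoader lives in piggybank; adjust the jar path for your installation (assumed here).
REGISTER piggybank.jar;
-- Each <document>...</document> element is read as one chararray record.
A = LOAD 'input3.xml' USING org.apache.pig.piggybank.storage.XMLLoader('document') AS (data:chararray);
-- The regex must stay on a single line; a line break inside the quoted pattern would become part of it.
B = FOREACH A GENERATE FLATTEN(REGEX_EXTRACT_ALL(data, '(?s)<document>.*?<url>([^>]*?)</url>.*?<category>([^>]*?)</category>.*?<usercount>([^>]*?)</usercount>.*?<reviews>.*?<review>\\s*([^>]*?)\\s*</review>.*?</reviews>.*?</document>'))
    AS (url:chararray, category:chararray, usercount:int, review:chararray);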
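The statements above only parse the XML into relation B; writing it to HBase still needs a STORE step. A minimal sketch, assuming a table named xml_docs with a column family cf has already been created in HBase (both names are hypothetical), using Pig's HBaseStorage, which takes the first field (url here) as the row key:

```pig
-- Assumption: table 'xml_docs' with column family 'cf' already exists in HBase.
-- The first field of B (url) becomes the row key; the remaining fields
-- map to the listed columns in order.
STORE B INTO 'hbase://xml_docs'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:category cf:usercount cf:review');
```

Note that the regex has a single review group, so only the first <review> of each document would be stored this way.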
Upvotes: 0