Reputation: 125
I have a little question about how to filter by row key when loading data from HBase. For now I've been doing it like this:
pigServer.registerQuery("$result = LOAD 'hbase://reach.${campaign.appId}' "
+ "USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('data:queued data:dropped', "
+ "'-loadKey -gte=key1 -lte=key20') "
But this only lets me get a range of keys, from key1 to key20. What I want is to specify individual keys rather than a range; for example, I only want key3, key5, key7, ...
Is there something like "filter by ..." that we can use? Thanks!
Upvotes: 1
Views: 729
Reputation: 1714
There is currently no way to do that with HBaseStorage, but check out Apache Phoenix (http://phoenix.apache.org). There you can run an IN query, which uses a skip scan to return a list of individual keys very efficiently.
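If you need to stay in Pig, one possible workaround is to load the smallest range that encloses the keys you want and then drop the rest with a Pig FILTER on the row key. This is only a sketch, not something HBaseStorage does for you: the scan still reads every row in the range, so it is far less efficient than a skip scan. The class name `KeyFilterSketch`, the table name `mytable`, the aliases `result` and `wanted`, the field name `id`, and the `chararray` types are all assumptions made for illustration, and the keys are assumed to contain no quote characters.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class KeyFilterSketch {

    // Builds a LOAD statement for the enclosing key range.
    // With -loadKey, HBaseStorage returns the row key as the first field,
    // named "id" here (an assumed name).
    // HBase compares row keys bytewise, so "key20" sorts before "key3":
    // minKey and maxKey must enclose the wanted keys in that order.
    public static String loadStatement(String table, String minKey, String maxKey) {
        return "result = LOAD 'hbase://" + table + "' "
            + "USING org.apache.pig.backend.hadoop.hbase.HBaseStorage("
            + "'data:queued data:dropped', "
            + "'-loadKey -gte=" + minKey + " -lte=" + maxKey + "') "
            + "AS (id:chararray, queued:chararray, dropped:chararray);";
    }

    // Builds a FILTER statement that keeps only the listed row keys.
    public static String filterStatement(List<String> keys) {
        String condition = keys.stream()
            .map(k -> "id == '" + k + "'")
            .collect(Collectors.joining(" OR "));
        return "wanted = FILTER result BY " + condition + ";";
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("key3", "key5", "key7");
        // Each statement could then be passed to pigServer.registerQuery(...).
        System.out.println(loadStatement("mytable", "key3", "key7"));
        System.out.println(filterStatement(keys));
    }
}
```

Note that the range key3 to key7 also covers keys such as key30 or key6, which is why the FILTER step is still needed after the load.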
Upvotes: 1