Reputation: 729
I am trying to pass in a list of regexes to the columns attribute in my happybase scan calls. This is because, my coloumn names are made by dynamically appending ids which i dont have acces to at scan time.
Is this possible?
Upvotes: 0
Views: 1550
Reputation: 3831
I am not familiar with HBase's syntax. Here is the happybase-python code I used, and it works for me. Thanks to Wouter Bolsterlee!! Not like the 'columns' statement, you don't have to put 'columnFamily' in 'ColumnPrefixFilter'.
import happybase
pool = happybase.ConnectionPool(size=3, host='172.xx.xx.xx')
with pool.connection() as conn1:
hbaseTable = conn1.table('HBase_table_name_here')
for rowKey, rowData in hbaseTable.scan(row_prefix= 'year-2015-', filter="ColumnPrefixFilter('month-06')", limit = 6):
print rowData
Upvotes: 1
Reputation: 4037
HappyBase author here.
According to the Thrift API you can pass regular expressions in the columns
argument for the ScannerOpen()
API family (see http://svn.apache.org/viewvc/hbase/trunk/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift?view=markup#l717). However, the Thrift API used by HappyBase is ScannerOpenWithScan()
, which uses the TScan
struct (see http://svn.apache.org/viewvc/hbase/trunk/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift?view=markup#l141), which does not contain any remark about regular expressions. Actually I don't know (without testing) whether this works.
A more flexible and powerful way is to specify a filter string using the filter
argument to happybase.Table.scan()
. See http://hbase.apache.org/book/thrift.html for the filter string syntax. In your case, something like "ColumnPrefixFilter('theprefix')"
should do the trick. See http://happybase.readthedocs.org/en/latest/api.html#happybase.Table.scan for the HappyBase API.
Upvotes: 3