Hari

Reputation: 729

Passing regex to column attribute in happybase scan

I am trying to pass a list of regexes to the columns argument in my happybase scan calls. I need this because my column names are built by dynamically appending IDs that I don't have access to at scan time.

Is this possible?

Upvotes: 0

Views: 1550

Answers (2)

kennyut

Reputation: 3831

I am not familiar with HBase's own syntax, but here is the HappyBase (Python) code I used, and it works for me. Thanks to Wouter Bolsterlee! Unlike the 'columns' argument, 'ColumnPrefixFilter' does not take the column family; you pass only the column qualifier prefix.

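import happybase

# Connection pool to the HBase Thrift server (host is masked here).
pool = happybase.ConnectionPool(size=3, host='172.xx.xx.xx')
with pool.connection() as conn1:
    hbaseTable = conn1.table('HBase_table_name_here')
    # Limit rows by key prefix and columns by qualifier prefix; the bytes literal works on Python 2 and 3.
    for rowKey, rowData in hbaseTable.scan(row_prefix=b'year-2015-', filter="ColumnPrefixFilter('month-06')", limit=6):
        print(rowData)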

Upvotes: 1

wouter bolsterlee

Reputation: 4037

HappyBase author here.

According to the Thrift API, you can pass regular expressions in the columns argument for the ScannerOpen() API family (see http://svn.apache.org/viewvc/hbase/trunk/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift?view=markup#l717). However, the Thrift API used by HappyBase is ScannerOpenWithScan(), which takes a TScan struct (see http://svn.apache.org/viewvc/hbase/trunk/hbase-thrift/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift?view=markup#l141), and the documentation for that struct says nothing about regular expressions. I have not tested it, so I don't know whether regular expressions work there.

A more flexible and powerful way is to specify a filter string using the filter argument to happybase.Table.scan(). See http://hbase.apache.org/book/thrift.html for the filter string syntax. In your case, something like "ColumnPrefixFilter('theprefix')" should do the trick. See http://happybase.readthedocs.org/en/latest/api.html#happybase.Table.scan for the HappyBase API.
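As a rough sketch of the filter approach, the helpers below build filter strings that can be passed to the filter argument of Table.scan(). The helper names (quote, column_prefix_filter, qualifier_regex_filter) are made up for illustration and are not part of HappyBase. The filter names and the 'regexstring:' comparator come from the HBase filter language; doubling embedded single quotes is my reading of that syntax, so please verify it against your HBase version. None of this has been tested against a live cluster.

```python
def quote(value):
    """Wrap a value as a filter-language string literal.

    Assumption: embedded single quotes are escaped by doubling them.
    """
    return "'" + value.replace("'", "''") + "'"


def column_prefix_filter(prefixes):
    """Build a filter matching column qualifiers that start with any prefix."""
    prefixes = list(prefixes)
    if not prefixes:
        raise ValueError("at least one prefix is required")
    if len(prefixes) == 1:
        return "ColumnPrefixFilter(%s)" % quote(prefixes[0])
    return "MultipleColumnPrefixFilter(%s)" % ", ".join(quote(p) for p in prefixes)


def qualifier_regex_filter(pattern):
    """Build a filter matching column qualifiers against a regular expression."""
    return "QualifierFilter(=, %s)" % quote("regexstring:" + pattern)


# Hypothetical usage; 'table' would be a happybase.Table obtained elsewhere:
#
#     for row_key, row_data in table.scan(filter=column_prefix_filter(['month-06'])):
#         print(row_key, row_data)
```

If the dynamic IDs are appended to a known fixed prefix, the prefix filters are likely the simpler choice; the regex variant is only worth trying when the fixed part is not at the start of the qualifier.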

Upvotes: 3
