Reputation: 1647
I am using a scanner to retrieve rows from HBase. I can set which columns I want back via the addColumn() method. However, I really need to be able to retrieve a variable number of columns that all start with the same prefix.
So, all the columns I want start with "USA", for example. I need to retrieve all columns that start with that, such as "USA-Virginia", "USA-Hawaii", etc. I do not want values such as "Canada-Quebec". There are no predefined values for the full column names anywhere. I just need all of them that start with "USA". Is there a way to get HBase Scanners to do this? I don't see much in the way of writing custom scanners out there.
I was looking at custom filters, but this just seems to limit the rows I get, as opposed to specifying the columns I want returned. Thoughts?
I cannot change the structure of my data, and all of my data is under a single column family.
Thanks for any ideas. I am running CDH3u4.
Upvotes: 3
Views: 1871
Reputation: 267
Try using org.apache.hadoop.hbase.filter.SubstringComparator (passed to a QualifierFilter, since a comparator is not a filter by itself); that might solve your problem.
Upvotes: 0
Reputation: 666
What you need is ColumnPrefixFilter, which filters KeyValues by the prefix of their column qualifier:
http://archive.cloudera.com/cdh/3/hbase/apidocs/org/apache/hadoop/hbase/filter/ColumnPrefixFilter.html
Something like this should do the trick (scan being the Scan you already use):
Filter filter = new ColumnPrefixFilter(Bytes.toBytes("USA"));
scan.setFilter(filter);
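In case it helps to see what "prefix" means for the example columns in the question, here is a minimal plain-Java sketch of a byte-wise prefix test on qualifier names. It does not use HBase at all and is not HBase's actual implementation. The names PrefixDemo, hasPrefix and selectByPrefix are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical names, not part of the HBase API.
class PrefixDemo {

    // True if the qualifier's leading bytes equal the prefix bytes.
    static boolean hasPrefix(byte[] qualifier, byte[] prefix) {
        if (qualifier.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (qualifier[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Keeps only the qualifiers that start with the prefix, preserving order.
    static List<String> selectByPrefix(List<String> qualifiers, String prefix) {
        byte[] prefixBytes = prefix.getBytes(StandardCharsets.UTF_8);
        List<String> kept = new ArrayList<>();
        for (String q : qualifiers) {
            if (hasPrefix(q.getBytes(StandardCharsets.UTF_8), prefixBytes)) {
                kept.add(q);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> qualifiers =
                Arrays.asList("USA-Virginia", "USA-Hawaii", "Canada-Quebec");
        // Prints [USA-Virginia, USA-Hawaii]
        System.out.println(selectByPrefix(qualifiers, "USA"));
    }
}
```

Note that a substring match, by contrast, would also accept a qualifier that merely contains "USA" somewhere in the middle, which is why a prefix test is the closer fit for this question.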
Upvotes: 5