whiterook6

Reputation: 3534

How do I specify a row key in hbase shell that has a tab in it?

In our infinite wisdom, we decided our rows would be keyed with a tab in the middle:

item_id <tab> location

For example:

000001  http://www.url.com/page

Using the HBase shell, we cannot perform a get because the tab character is not entered properly on the input line. We tried

get 'tableName', '000001\thttp://www.url.com/page'

without success. What should we do?

Upvotes: 6

Views: 6555

Answers (2)

Suman

Reputation: 9561

Hopefully you can still change the tab character. :) It is a bad choice because MapReduce jobs use the tab as a delimiter, and it's generally a bad idea to use a tab or space as a delimiter.

You could use a double colon (::) as a delimiter instead. But what if the URL itself contains a double colon? URL-encode the URL when you store it in HBase: that way you have a standard delimiter, and the URL part of the key cannot conflict with it.

In Python:

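# Python 3: quote_plus lives in urllib.parse (Python 2 had urllib.quote_plus)
from urllib.parse import quote_plus

DELIMITER = "::"
# quote_plus percent-encodes ':' and '/', so the encoded URL cannot contain "::"
urlkey = quote_plus(location)

rowkey = item_id + DELIMITER + urlkey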
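To illustrate the round trip, here is a minimal sketch that builds a key and splits it back into its parts. The helper names make_rowkey and split_rowkey are made up for this example, and it assumes item_id itself never contains "::".

```python
from urllib.parse import quote_plus, unquote_plus

DELIMITER = "::"


def make_rowkey(item_id, location):
    # Hypothetical helper: percent-encode the URL so it cannot contain "::".
    return item_id + DELIMITER + quote_plus(location)


def split_rowkey(rowkey):
    # Hypothetical helper: split on the first "::" and decode the URL part.
    # Assumes item_id never contains the delimiter.
    item_id, _, urlkey = rowkey.partition(DELIMITER)
    return item_id, unquote_plus(urlkey)


rowkey = make_rowkey("000001", "http://www.url.com/page")
print(rowkey)
print(split_rowkey(rowkey))
```

Because the colons in the URL are encoded as %3A, the first "::" in the key is always the real delimiter.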

Upvotes: 0

Pierre-Luc Bertrand

Reputation: 740

I had the same issue for binary values: \x00. This was my separator.

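For the shell to accept your binary values, you need to provide them in double quotes (") instead of single quotes ('). The HBase shell is a Ruby shell, and Ruby only interprets escape sequences such as \x00 or \t inside double-quoted strings.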

put 'MyTable', "MyKey", 'Family:Qualifier', "\x00\x00\x00\x00\x00\x00\x00\x06Hello from shell"

Check how your tab is being encoded. My best guess is that it is UTF-8 encoded, in which case it is the single byte 0x09 from the ASCII table, and the key would be "000001\x09http://www.url.com/page".

As a side note, consider using the null character (\x00) as your separator; it will help you in scans.
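If you have the raw key bytes at hand, a small script can print them as a double-quoted literal to paste into the shell. This is only a sketch: to_shell_literal is a made-up helper, not part of HBase, and it hex-escapes every byte outside printable ASCII, plus the double quote, backslash and # characters, which are special inside Ruby double-quoted strings.

```python
def to_shell_literal(key: bytes) -> str:
    # Hypothetical helper: render raw key bytes as a double-quoted string
    # suitable for pasting into the (Ruby-based) HBase shell.
    out = []
    for b in key:
        ch = chr(b)
        if 0x20 <= b <= 0x7E and ch not in '"\\#':
            out.append(ch)  # printable ASCII that is safe inside double quotes
        else:
            out.append("\\x%02X" % b)  # everything else as a \xNN escape
    return '"' + "".join(out) + '"'


# Assumes the key was written as UTF-8, so the tab is the single byte 0x09.
key = "000001\thttp://www.url.com/page".encode("utf-8")
print("get 'tableName', " + to_shell_literal(key))
```

For the key in the question this prints the get command with the tab written as \x09.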
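One likely reason the null separator helps: \x00 is the smallest possible byte, and HBase orders row keys byte by byte, so all keys for one item_id form a contiguous range that a scan can bound with a start and stop row. The sketch below only simulates that ordering with Python's bytes comparison; prefix_scan_bounds is a made-up helper, and it assumes item_id never contains \x00.

```python
SEP = b"\x00"


def prefix_scan_bounds(item_id: bytes):
    # Hypothetical helper: start row (inclusive) and stop row (exclusive)
    # covering every key of the form item_id + b"\x00" + location.
    start = item_id + SEP
    stop = item_id + b"\x01"  # the next byte value after the separator
    return start, stop


# Python compares bytes lexicographically, like HBase compares row keys.
keys = sorted([
    b"000001\x00http://www.url.com/page",
    b"000001\x00http://www.url.com/other",
    b"0000010\x00http://www.url.com/page",
    b"000002\x00http://www.url.com/page",
])

start, stop = prefix_scan_bounds(b"000001")
matches = [k for k in keys if start <= k < stop]
for k in matches:
    print(k)
```

Only the two keys for item 000001 fall inside the range; the key for the longer id 0000010 is excluded because its seventh byte sorts after \x01.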

Upvotes: 13
