priya

Reputation: 26699

How to find the number of columns within a row key in hbase

How can I find the number of columns stored under a given row key in HBase? A row can have many columns.

Upvotes: 3

Views: 3143

Answers (3)

Sun Wei

Reputation: 21

Thanks to @user3375803. You don't actually need an external text file. I can't comment on your answer, so I'm posting this as a separate answer:

echo "scan 'mytable', {STARTROW=>'mystartrow', ENDROW=>'myendrow'}" | hbase shell | wc -l | awk '{print $1-8}'

Upvotes: 2

user3375803

Reputation: 149

There is a simple way:

Use the hbase shell to scan the table and write the output to an intermediate text file. Because the hbase shell prints each column of a row on its own line, we can simply count the lines in the text file (minus the first 6 lines, which are standard hbase shell output, and the last 2 lines).

echo "scan 'mytable', {STARTROW=>'mystartrow', ENDROW=>'myendrow'}" | hbase shell > row.txt
wc -l row.txt
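The number of header and footer lines may differ between HBase versions, so subtracting a fixed offset can be fragile. A minimal sketch of an alternative, assuming each cell line in the scan output contains the text "column=" (this is how the shell has printed cells in the versions I have seen, but check your own output first). The function name count_cells is made up for illustration:

```shell
# count_cells FILE
# Counts the lines in FILE that look like cell lines of an hbase shell scan.
# Assumption: every cell line contains "column=" and banner lines do not.
count_cells() {
  grep -c 'column=' "$1"
}

# Hypothetical usage with the file written by the scan command:
#   count_cells row.txt
```

This counts only lines that look like cells, so the banner and the trailing summary lines no longer matter.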

Make sure to select the appropriate row keys: the start row is inclusive, but the end row is exclusive.

If you are only interested in specific columns or column families, add a filter to the hbase shell command above (e.g. FamilyFilter, ColumnRangeFilter, ...).
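As a rough sketch of what such a filtered scan could look like, the helper below only builds the command text for a FamilyFilter. The function name build_scan and the example values are hypothetical, and the filter string follows the HBase filter language as I understand it, so verify it against your version before relying on it:

```shell
# build_scan TABLE STARTROW ENDROW FAMILY
# Prints an hbase shell scan command restricted to one column family.
# Assumption: the shell accepts FILTER => "FamilyFilter(=, 'binary:<family>')".
build_scan() {
  printf "scan '%s', {STARTROW=>'%s', ENDROW=>'%s', FILTER=>\"FamilyFilter(=, 'binary:%s')\"}\n" \
    "$1" "$2" "$3" "$4"
}

# Hypothetical usage: pipe the generated command into the shell.
#   build_scan mytable mystartrow myendrow cf | hbase shell > row.txt
```

The helper does not escape quotes, so it is only suitable for simple table names, row keys, and family names.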

Upvotes: 2

Arnon Rotem-Gal-Oz

Reputation: 25909

I don't think there's a direct way to do that as each row can have a different number of columns and they may be spread over several files.

If you don't want to bring the whole row to the client and count there, you can write an endpoint coprocessor (HBase's version of a stored procedure, if you like) to perform the calculation on the region server side and return only the result. You can read a little about coprocessors here.

Upvotes: 1
