Donald Miner

Reputation: 39893

Pig: Count number of keys in a map

I'd like to count the number of keys in a map in Pig. I could write a UDF to do this, but I was hoping there would be an easier way.

data = LOAD 'hbase://MARS1'
       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
         'A:*', '-loadKey true -caching=100000')
       AS (id:bytearray, A_map:map[]);

From the data loaded above, I basically want to build a histogram: for each id, how many items that key has in column family A.

Hoping it would just work, I tried c = FOREACH data GENERATE id, COUNT(A_map); but unsurprisingly it didn't.

Or, perhaps someone can suggest a better way to do this entirely. If I can't figure this out soon I'll just write a Java MapReduce job or a Pig UDF.

Upvotes: 1

Views: 673

Answers (1)

Chris White

Reputation: 30089

SIZE should apparently work for you (I haven't tried it myself).
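A minimal sketch of how that might look, reusing the LOAD statement from the question. Per the Pig built-in function docs, SIZE applied to a map returns its number of key/value pairs, whereas COUNT expects a bag. The alias counts and the field name num_keys are made up for illustration, and this has not been run against HBase.

```pig
-- Load the row key plus every column in family A as a map (same as in the question).
data = LOAD 'hbase://MARS1'
       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
         'A:*', '-loadKey true -caching=100000')
       AS (id:bytearray, A_map:map[]);

-- SIZE on a map yields the number of key/value pairs, i.e. the number of columns for that row.
-- 'counts' and 'num_keys' are hypothetical names chosen for this example.
counts = FOREACH data GENERATE id, SIZE(A_map) AS num_keys;

DUMP counts;
```

If you want the distribution of counts rather than one line per id, you could GROUP counts BY num_keys and apply COUNT to the resulting bag.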

Upvotes: 2
