hoducha

Reputation: 132

Apache Pig: How to use LoadCaster to convert Writable objects to Pig types?

Can we load a sequence file of Writable KEY,VALUE pairs and use the LoadCaster interface to convert the raw byte arrays of each KEY and VALUE to Pig data types?

If so, is there some example of the pig code that would be used to load the sequence file and invoke the LoadCaster?

Specifically I'm doing this currently:

A = LOAD '/tmp/part-m-00000' using SequenceFileLoader AS (key:bytearray, value:bytearray);

This works so far, but I don't know the Pig syntax to then convert key and value to their respective tuples using a LoadCaster object of my own creation.

Upvotes: 0

Views: 301

Answers (1)

David Parks

Reputation: 32081

It seems the answer to this is to use the SequenceFileLoader from Elephant Bird (and be sure not to confuse the one from the Elephant Bird library with the old one from the piggybank library).

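Instead of a LoadCaster, that loader takes converter classes for the key and the value. A custom converter can be written by following the pattern of the existing converters in that same package.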
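As a rough sketch of what the load call could look like with Elephant Bird, assuming the keys are IntWritable and the values are Text (the question does not say which Writable types are in the file), and assuming the jar paths below, which are placeholders:

```pig
-- Jar paths are placeholders; register whichever Elephant Bird jars your build provides.
REGISTER /path/to/elephant-bird-core.jar;
REGISTER /path/to/elephant-bird-hadoop-compat.jar;
REGISTER /path/to/elephant-bird-pig.jar;

-- First '-c' names the key converter, second '-c' names the value converter.
-- IntWritable keys and Text values are an assumption made for illustration.
A = LOAD '/tmp/part-m-00000'
    USING com.twitter.elephantbird.pig.load.SequenceFileLoader (
        '-c com.twitter.elephantbird.pig.util.IntWritableConverter',
        '-c com.twitter.elephantbird.pig.util.TextConverter'
    ) AS (key:int, value:chararray);
```

For a custom Writable, the class name of your own converter would go in place of one of the built-in ones.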

Upvotes: 0
