Uno

Reputation: 543

Example and more explanation about LoadFunc

Where can I find more information and examples about LoadFunc? Apart from http://web.archive.org/web/20130701024312/http://ofps.oreilly.com/titles/9781449302641/load_and_store_funcs.html I don't see any examples that use the new LoadFunc APIs. Can anyone please let me know where I can find an example of writing a load UDF?

Upvotes: 2

Views: 2585

Answers (1)

AvkashChauhan

Reputation: 20571

As of 0.7.0, Pig loaders extend the LoadFunc abstract class. This means they need to override four methods:

  • getInputFormat() returns to the caller an instance of the InputFormat that the loader supports. The load process needs an instance to use at load time, and places no constraints on how that instance is created.

  • prepareToRead() is called prior to reading a split. It passes in the reader used during the reads of the split, as well as the actual split. The implementation of the loader usually keeps the reader, and may want to access the actual split if needed.

  • setLocation() Pig calls this to communicate the load location to the loader, which is responsible for passing that information to the underlying InputFormat object. This method can be called multiple times, so there should be no state associated with the method (unless that state gets reset when the method is called).

  • getNext() Pig calls this to get the next tuple from the loader once all setup has been done. If this method returns null, Pig assumes that all information in the split passed via the prepareToRead() method has been processed.
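To make the call order concrete, here is a minimal sketch of the lifecycle described above. It deliberately does not depend on the Pig or Hadoop jars: `SketchLoadFunc`, `CommaLoader`, `readSplit` and the path `/data/input.csv` are hypothetical stand-ins invented for illustration, and a `BufferedReader` plays the role of the record reader. In a real loader the same four methods take Pig and Hadoop types (`setLocation(String, Job)`, `getInputFormat()` returning an `InputFormat`, `prepareToRead(RecordReader, PigSplit)`, and `getNext()` returning a `Tuple`), so treat this only as a model of who calls what, and when.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoaderLifecycleDemo {

    // Simplified stand-in for the four-method contract (NOT Pig's real LoadFunc).
    abstract static class SketchLoadFunc {
        abstract void setLocation(String location);
        abstract String getInputFormat();
        abstract void prepareToRead(BufferedReader reader, String split) throws IOException;
        abstract List<String> getNext() throws IOException;
    }

    // Hypothetical loader that turns each comma-separated line into one "tuple".
    static class CommaLoader extends SketchLoadFunc {
        private String location;
        private BufferedReader reader;

        @Override
        void setLocation(String location) {
            // May be called several times, so it only overwrites the previous value.
            this.location = location;
        }

        @Override
        String getInputFormat() {
            // A real loader would return an InputFormat instance here.
            return "text";
        }

        @Override
        void prepareToRead(BufferedReader reader, String split) {
            // Keep the reader for the split that is about to be read.
            this.reader = reader;
        }

        @Override
        List<String> getNext() throws IOException {
            String line = reader.readLine();
            if (line == null) {
                return null; // null signals that the split is exhausted
            }
            return Arrays.asList(line.split(",", -1));
        }
    }

    // Plays the role of the framework: setup calls first, then getNext() until null.
    static List<List<String>> readSplit(SketchLoadFunc loader, String location,
                                        String splitData) throws IOException {
        loader.setLocation(location);
        loader.setLocation(location); // a repeated call must be harmless
        loader.getInputFormat();
        loader.prepareToRead(new BufferedReader(new StringReader(splitData)), "split-0");

        List<List<String>> tuples = new ArrayList<>();
        List<String> tuple;
        while ((tuple = loader.getNext()) != null) {
            tuples.add(tuple);
        }
        return tuples;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readSplit(new CommaLoader(), "/data/input.csv", "a,1\nb,2\n"));
        // prints [[a, 1], [b, 2]]
    }
}
```

The loader keeps only the location and the reader it was handed, so a second `setLocation()` call simply overwrites the first. That is one way to satisfy the "no state unless it gets reset" requirement.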

Here are a few nice articles on writing a custom load function for Pig:

Upvotes: 6
