Gr-Disarray

Reputation: 534

How to load files recursively using Apache Pig

I am pretty new to Pig and I have a very basic question: can I make Pig load all files from a directory, including the ones in its subfolders? Here is how I proceed:

records = LOAD '/worldwide/data/' USING PigStorage() AS (event:chararray, user:chararray);

Here, repo/data may have subfolders such as

repo/data/region/cluster1
repo/data/region/cluster2 

Can I get it to load everything from both those subdirectories and any new directories that might get added at a future date?

Upvotes: 5

Views: 1731

Answers (1)

Gr-Disarray

Reputation: 534

I confirmed that the statement above works as written: it loads all of the data from the subdirectories into the records relation.
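
If you would rather spell out which subdirectories get picked up, a glob in the LOAD path is another option, since Pig hands the path to Hadoop and Hadoop expands glob patterns. This is only a minimal sketch: the two-level depth (`*/*`) is an assumption based on the region/cluster1 layout in the question, and the `grouped` and `total` aliases are just illustrative names for a quick row-count check.

```pig
-- Sketch only: the '*/*' depth assumes files sit exactly two levels
-- below the data directory (e.g. region/cluster1), as in the question.
records = LOAD '/worldwide/data/*/*' USING PigStorage() AS (event:chararray, user:chararray);

-- Sanity check: count every loaded row, including rows with null fields.
grouped = GROUP records ALL;
total = FOREACH grouped GENERATE COUNT_STAR(records);
DUMP total;
```

Unlike the plain directory path, a fixed-depth glob only matches directories at that depth, so the plain directory form is probably the better fit if new subfolders may appear at other levels.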

Upvotes: 2
