Reputation: 1
I have been using this to load a single text file:
A = LOAD '1try.txt' USING PigStorage(' ') as (c1:chararray,c2:chararray,c3:chararray,c4:chararray);
Upvotes: 0
Views: 13982
Reputation: 28209
data = load '/FOLDER/PATH' using PigStorage(' ') AS (<name>:<type>, ...);
OR
data = load '/FOLDER/PATH' using HBaseStorage();
Upvotes: 0
Reputation: 173
The official Pig documentation states that the LOAD statement can load all the files in a directory: http://pig.apache.org/docs/r0.14.0/basic.html#load
Syntax: LOAD 'data' [USING function] [AS schema];
Where: 'data': The name of the file or directory, in single quotes. If you specify a directory name, all the files in the directory are loaded.
Upvotes: 1
Reputation: 926
You can use a folder name instead of a file name, like this:
A = LOAD 'myfolder' USING PigStorage(' ')
AS (c1:chararray,c2:chararray,c3:chararray,c4:chararray);
Pig will load all files in the specified folder, as stated in Programming Pig:
When specifying a “file” to read from HDFS, you can specify directories. In this case, Pig will find all files under the directory you specify and use them as input for that load statement. So, if you had a directory input with two datafiles today and yesterday under it, and you specified input as your file to load, Pig will read both today and yesterday as input. If the directory you specify has other directories, files in those directories will be included as well.
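If you only want some of the files in the folder, LOAD also accepts Hadoop-style glob patterns in the path. A minimal sketch, assuming the same hypothetical 'myfolder' directory and four-column, space-delimited layout as above (the .txt extension and the today/yesterday file names are only for illustration):

```pig
-- Assumption: 'myfolder' holds space-delimited files with four columns.
-- Load only the files ending in .txt, using a Hadoop glob.
A = LOAD 'myfolder/*.txt' USING PigStorage(' ')
    AS (c1:chararray, c2:chararray, c3:chararray, c4:chararray);

-- Load two named files from the same folder with a brace glob.
-- 'today' and 'yesterday' are hypothetical file names.
B = LOAD 'myfolder/{today,yesterday}' USING PigStorage(' ')
    AS (c1:chararray, c2:chararray, c3:chararray, c4:chararray);

-- Print the first relation to check what was read.
DUMP A;
```

Every matched file is parsed with the same schema, so this only makes sense when the files share one layout.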
Upvotes: 4