Reputation: 324
Hello, I have a directory with sub-directories named a1, a2, ..., a8,
and each of these sub-directories contains multiple files like
bat-a1-0-0
bat-a1-0-1
bat-a1-1-0
bat-a1-1-1
...
bat-a1-31-0
bat-a1-31-1
and sub-directory a2 is similar:
bat-a2-0-0
bat-a2-0-1
bat-a2-1-0
bat-a2-1-1
...
bat-a2-31-0
bat-a2-31-1
To keep things simple, I decided to use multiple LOAD statements, one per directory, and then find a way to UNION the results to get everything. But I do not know how to load the files in each directory using Apache Pig version 0.10.0-cdh4.2.1,
since the file names do not seem to follow a simple pattern. Any help is appreciated, thanks.
Upvotes: 2
Reputation: 21561
In fact, this may be simpler than you think. When you load files in Pig, you can simply point LOAD at a directory, and Pig will recursively load all the files in it, even those that are deeply nested.
So the solution is: make sure all your data sits under one (or a few) directories, and load those.
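As a rough sketch of what that could look like: the parent path `/data/bats` and the tab delimiter below are assumptions for illustration, not details from the question, so adjust them to your layout. The second and third options are alternatives if you want to restrict the load to specific sub-directories or keep the UNION approach from the question.

```pig
-- Assumption: /data/bats is a hypothetical parent directory holding a1 ... a8,
-- and the files are tab-delimited text (PigStorage's default delimiter).

-- Option 1: point LOAD at the parent directory and let Pig pick up the files under it.
all_bats = LOAD '/data/bats' USING PigStorage('\t');

-- Option 2: use a Hadoop-style glob to restrict the load to a1..a8 and bat-* files.
some_bats = LOAD '/data/bats/a[1-8]/bat-*' USING PigStorage('\t');

-- Option 3: one LOAD per sub-directory, combined with UNION as proposed in the question.
a1 = LOAD '/data/bats/a1' USING PigStorage('\t');
a2 = LOAD '/data/bats/a2' USING PigStorage('\t');
a1_a2 = UNION a1, a2;
```

Option 1 is the least code; Option 2 is useful if the parent directory also contains files you do not want loaded.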
Upvotes: 1