Reputation: 5346
Can I do something like this in Pig Latin?
data1 = LOAD 'hadoop/text1.txt' AS (line:chararray);
data2 = LOAD 'hadoop/text2.txt' AS (line:chararray);
mixed = FOREACH data1, data2 GENERATE data1:line, data2:line;
Upvotes: 0
Views: 442
Reputation: 1013
In general, what you are asking for doesn't make sense in Pig, because the data is loaded by multiple mappers, each reading its input one line at a time. There is no guarantee that corresponding lines of the two files will be seen by the same mapper, and no guarantee that a mapper knows which line of which block it is reading. As WinnieNicklaus mentioned, the best thing to do is to label each line with its line number and then join the two relations on that label.
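A minimal sketch of the label-and-join approach, assuming Pig 0.11 or later, where the RANK operator is available. Used without a BY clause, RANK prepends a sequential number to each tuple in a field named rank_<alias>. The aliases ranked1, ranked2 and joined are made up for illustration. I am also assuming the numbering follows the order of lines in each file, which you should verify on your own data, especially if the input is split across several files.

```pig
data1 = LOAD 'hadoop/text1.txt' AS (line:chararray);
data2 = LOAD 'hadoop/text2.txt' AS (line:chararray);

-- Label every line with a sequential number (adds rank_data1 / rank_data2).
ranked1 = RANK data1;
ranked2 = RANK data2;

-- Pair up lines that received the same number.
joined = JOIN ranked1 BY rank_data1, ranked2 BY rank_data2;

-- Keep only the two text columns, side by side.
mixed = FOREACH joined GENERATE ranked1::line, ranked2::line;
```

This is an inner join, so if one file has more lines than the other, the extra lines are dropped. A FULL OUTER join would keep them, with nulls on the shorter side.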
Upvotes: 2