Reputation: 11
I have two files and I want to merge them sequentially. How can I do this with a Pig Latin script?
f1.csv
1,aa
1,aa
1,ab
1,ac
2,bd
2,bd
2,bd
4,ab
4,bc
f2.csv
1,xxx
1,xxy
1,xyx
1,yxx
1,xyy
1,yyx
2,pqr
2,pq
2,pqrs
2,pqs
3,def
And the output I need is:
1,aa,1,xxy
1,aa,1,xyx
1,ab,1,yxx
1,ac,1,xyy
2,bd,2,pqr
2,bd,2,pq
2,bd,2,pqrs
Can anyone tell me which join I should use and how to get this output?
Upvotes: 1
Views: 5154
Reputation: 1916
1) LOAD each file.
2) Then UNION them together
http://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#UNION
3) STORE the new unioned alias.
P.S. You can SET default_parallel 1; (the key is lowercase and case sensitive) to run with a single reducer, so that a job with a reduce phase writes only one output file.
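
The three steps above could be sketched in Pig Latin roughly as follows. This is only a minimal sketch: the field names (id, val), their types, and the output directory merged_output are assumptions for illustration, and the input paths assume both CSV files sit in the working directory.

```pig
-- Ask for a single reducer (only affects jobs that have a reduce phase).
SET default_parallel 1;

-- 1) LOAD each file. Field names and types are assumed, not given in the question.
f1 = LOAD 'f1.csv' USING PigStorage(',') AS (id:int, val:chararray);
f2 = LOAD 'f2.csv' USING PigStorage(',') AS (id:int, val:chararray);

-- 2) UNION the two relations into one.
merged = UNION f1, f2;

-- 3) STORE the unioned alias. 'merged_output' is a hypothetical output directory.
STORE merged INTO 'merged_output' USING PigStorage(',');
```

Two caveats: UNION appends the rows of one relation to the other and does not guarantee any row order, so the result is a stacked list of rows rather than side-by-side columns. Also, a plain LOAD/UNION/STORE pipeline is usually map-only, so the reducer setting may not by itself reduce the number of part files written.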
Upvotes: 3