EDJ
EDJ

Reputation: 485

How Do I transpose columns and rows in PIG

I'm not sure if this can be done with builtin PIG scripts or I'll need to code a UDF. But I have essentially a table where I simply want to transpose the data.

Simple put, given:

(1, 2, 3, 4, 5)
(6, 7, 8, 9, 10)
(11, 12, 13, 14, 15)
 ... 300 plus more tuples

I would end up with:

(1,6,11,...) -> goes on for a few hundred more
(2,7,12,...)
(3,8,13,...)
(4,9,14,...)
(5,10,15,...)

Any suggestions on how I could accomplish this?

Upvotes: 0

Views: 1866

Answers (1)

reo katoa
reo katoa

Reputation: 5801

This is not possible with Pig, nor does it make much sense for it to be. Remember that a relation is a bag of tuples, and by definition, a bag is not guaranteed to have its tuples in any specific order. You might start with

(1, 2, 3, 4, 5)
(6, 7, 8, 9, 10)
(11, 12, 13, 14, 15)

but from Pig's perspective there is no difference between this and

(11, 12, 13, 14, 15)
(1, 2, 3, 4, 5)
(6, 7, 8, 9, 10)

which means that "transpose" is ill-defined. Look at it this way -- if you transpose twice, you should end up with the same data structure back, but because the tuples can be reordered along the way, this is not guaranteed to happen.

In the end, if you really must do matrix operations, you would be better off using a tool that respects ordering in both rows and columns.

That said, what are you trying to accomplish?

Upvotes: 1

Related Questions