Aslan986
Aslan986

Reputation: 10314

Cast error in Pig join

I have a script that performs a JOIN; when I run it on small data it succeeds, but when I increase the data size I get this error:

14/10/07 19:10:19 ERROR executionengine.Launcher: Backend error message
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POProject (Name: Project[tuple][0] - scope-577 Operator Key: scope-577) children: null at []]: java.lang.ClassCastException: java.lang.Long cannot be cast to org.apache.pig.data.Tuple
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:339)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNextTuple(POLocalRearrange.java:304)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POUnion.getNextTuple(POUnion.java:167)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:277)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.ClassCastException: java.lang.Long cannot be cast to org.apache.pig.data.Tuple
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNextTuple(POProject.java:475)
        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:334)
        ... 13 more

I guess that the problem is not due to error in the input, rather to their size (dataset of medium size do not run on a development server, but runs on a bigger cluster).

Can you help me understanding the cause of the error?

Upvotes: 0

Views: 443

Answers (1)

Joshua Martell
Joshua Martell

Reputation: 7212

My guess is that there's a row in the large dataset that is a Long value instead of a tuple. This is causing the cast exception. Posting your pig script and a few example rows would also be helpful.

Upvotes: 1

Related Questions