Mohammed47

Reputation: 117

Pig + Cassandra : ERROR 1070

I'm using Hadoop 1.0.4, Cassandra 1.2.2, and Pig 0.11.0.

I want to run this statement in the Grunt shell:

**grunt> rows = LOAD 'cassandra://Keyspace1/Users' USING CassandraStorage() AS (key, columns: bag {T: tuple(name, value)});**

But I get this error:

**2013-03-19 11:15:54,957 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve CassandraStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]**

The log file contains:

Pig Stack Trace

ERROR 1070: Could not resolve CassandraStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

Failed to parse: Pig script failed to parse: pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve CassandraStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:191)
    at org.apache.pig.PigServer$Graph.validateQuery(PigServer.java:1571)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1544)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:516)
    at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:991)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:412)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
    at org.apache.pig.Main.run(Main.java:538)
    at org.apache.pig.Main.main(Main.java:157)
Caused by: pig script failed to validate: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve CassandraStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1209)
    at org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:1194)
    at org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:4766)
    at org.apache.pig.parser.LogicalPlanGenerator.load_clause(LogicalPlanGenerator.java:3183)
    at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1315)
    at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:799)
    at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:517)
    at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:392)
    at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)
    ... 10 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 1070: Could not resolve CassandraStorage using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:523)
    at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:1206)
    ... 18 more

Thanks.

Upvotes: 4

Views: 2750

Answers (3)

kenlz

Reputation: 461

I solved mine by registering the Cassandra jar and defining the storage function explicitly:

register hdfs:/udf/cassandra-all.jar;
define CqlStorage org.apache.cassandra.hadoop.pig.CqlNativeStorage();

Upvotes: 0

Ed Wei

Reputation: 56

This is definitely a PIG_CLASSPATH problem. You should be running pig_cassandra from the examples/pig/bin directory that ships with the Cassandra source distribution. That script builds the classpath for you before starting Pig.

You also need to set the following environment variables:

export JAVA_HOME=<Oracle Java 6 directory>
export PIG_HOME=<Pig directory>
export PIG_CONF_DIR=<Hadoop conf directory>   # needed when running distributed MapReduce
export PIG_INITIAL_ADDRESS=<IP of a Cassandra node>
export PIG_RPC_PORT=<Cassandra RPC port>      # e.g. 9160
export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner

Note: you must build the Cassandra source with ant once before running pig_cassandra. The build generates some libraries in the cassandra_source/build/lib/jars folder that the pig_cassandra script needs. Otherwise you will get errors when starting Pig. I can't remember the exact error, but it was something along the lines of a method not being found during the serialization/deserialization stage inside Pig.
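Since a missing variable is easy to overlook, a small pre-flight check can help before launching pig_cassandra. This is only a sketch: `missing_pig_env` is a hypothetical helper name, and the variable list is simply the one given above.

```shell
# Sketch: print the name of every variable from the list above that is
# unset or empty. missing_pig_env is a hypothetical helper, not part of
# Pig or Cassandra.
missing_pig_env() {
  for name in JAVA_HOME PIG_HOME PIG_CONF_DIR PIG_INITIAL_ADDRESS PIG_RPC_PORT PIG_PARTITIONER; do
    # Portable indirect lookup: copy the value of the variable named by $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || echo "$name"
  done
}

# Usage: warn before starting pig_cassandra if anything is missing.
missing=$(missing_pig_env)
if [ -n "$missing" ]; then
  echo "Set these variables first:" $missing >&2
fi
```

If the function prints nothing, all six variables are set and non-empty; it does not check that the values are actually valid.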

Upvotes: 2

Lorand Bendig

Reputation: 10650

Based on the Pygmalion project's documentation and the source of the pig_cassandra script, you can connect Pig to Cassandra as follows:

for jar in $CASSANDRA_HOME/lib/*.jar; do CLASSPATH=$CLASSPATH:$jar; done;
export PIG_CLASSPATH=$PIG_CLASSPATH:$CLASSPATH;
export PIG_OPTS="$PIG_OPTS -Dudf.import.list=org.apache.cassandra.hadoop.pig";
export PIG_INITIAL_ADDRESS=localhost;
export PIG_RPC_PORT=9160;
export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner;
pig

Also make sure to add the Cassandra jars to HADOOP_CLASSPATH (e.g. set it in hadoop-env.sh).
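For the hadoop-env.sh part, something along these lines could work. It is a sketch that assumes CASSANDRA_HOME is visible to hadoop-env.sh and that the jars live in its lib directory, as in the loop above; `jars_classpath` is a hypothetical helper name.

```shell
# Sketch: join every *.jar in a directory with ':' and print the result.
# jars_classpath is a hypothetical helper, not a Hadoop or Cassandra command.
jars_classpath() {
  cp=""
  for jar in "$1"/*.jar; do
    [ -e "$jar" ] || continue      # the glob matched nothing
    cp="${cp:+$cp:}$jar"           # add ':' only between entries
  done
  printf '%s\n' "$cp"
}

# In hadoop-env.sh: append the Cassandra jars if CASSANDRA_HOME is set.
if [ -n "${CASSANDRA_HOME:-}" ]; then
  cassandra_jars=$(jars_classpath "$CASSANDRA_HOME/lib")
  if [ -n "$cassandra_jars" ]; then
    export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$cassandra_jars"
  fi
fi
```

Unlike a plain `CLASSPATH=$CLASSPATH:$jar` loop, this avoids a leading ':' when the variable starts out empty; an empty classpath entry is commonly treated as the current directory.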

Upvotes: 4
