Reputation: 403
I am trying to connect to a cluster server from my local machine using Java in order to run Pig queries. This is the code I am running so far:
import java.io.IOException;
import java.util.Properties;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class CommitPig {
    public static void main(String[] args) {
        try {
            // Read the Pig script from disk (pigScriptReader is my own helper class)
            String pigScript = "category_count.pig";
            pigScriptReader psReader = new pigScriptReader();
            psReader.readPigScript(pigScript);
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            // Point PigServer at the remote cluster
            Properties props = new Properties();
            props.setProperty("fs.default.name", "<server id>");
            props.setProperty("mapred.job.tracker.http.address", "<server id>");
props.setProperty("<server id> ");
            PigServer pigServer = new PigServer(ExecType.MAPREDUCE, props);
            runIdQuery(pigServer, "<input location>");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    private static void runIdQuery(PigServer pigServer, String inputFile) throws IOException {
        pigServer.registerQuery("A = load '" + inputFile + "' using PigStorage(' ');");
        pigServer.registerQuery("B = filter A BY $0 == 'testing';");
        pigServer.store("B", "id.out");
    }
}
It gives me the following error:
ERROR 4010: Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath)
I tried to set up the classpath on the cluster by following the instructions from Apache:
Running the Pig Scripts in Mapreduce Mode
To run the Pig scripts in mapreduce mode, do the following:
Set the PIG_CLASSPATH environment variable to the location of the cluster configuration directory (the directory that contains the core-site.xml, hdfs-site.xml and mapred-site.xml files):
export PIG_CLASSPATH=/mycluster/conf
Set the HADOOP_CONF_DIR environment variable to the location of the cluster configuration directory:
export HADOOP_CONF_DIR=/mycluster/conf
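As a sanity check, here is a minimal diagnostic sketch (ConfCheck is a hypothetical helper class, not part of my project): it prints whether the JVM can actually see the configuration files on its classpath, which is exactly what ERROR 4010 complains about.
import java.net.URL;

// Minimal diagnostic: ERROR 4010 means Pig found neither hadoop-site.xml
// nor core-site.xml on the classpath, so print where (if anywhere) they
// resolve for this JVM.
public class ConfCheck {
    public static void main(String[] args) {
        ClassLoader cl = ConfCheck.class.getClassLoader();
        URL hadoopSite = cl.getResource("hadoop-site.xml");
        URL coreSite = cl.getResource("core-site.xml");
        System.out.println("hadoop-site.xml -> " + hadoopSite); // null if not visible
        System.out.println("core-site.xml   -> " + coreSite);   // null if not visible
    }
}
If both print null when launched the same way as CommitPig, then PIG_CLASSPATH and HADOOP_CONF_DIR are not reaching this JVM (they are read by the pig launcher script), and /mycluster/conf would also need to go on the java command's own -classpath.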
However, I am still getting the same error. Am I misunderstanding something here? Can someone help me understand what exactly the issue is and how to solve it?
Thank you!
Upvotes: 2
Views: 9769
Reputation: 1
I included the Hadoop configuration files (core-site.xml and mapred-site.xml) as resources in Maven's pom.xml:
<build>
  ...
  <resources>
    <resource>
      <directory>[hadoop-directory]/etc/hadoop</directory>
      <includes>
        <include>core-site.xml</include>
        <include>mapred-site.xml</include>
      </includes>
    </resource>
  </resources>
  ...
</build>
Upvotes: 0
Reputation: 11
You have to set the property "pig.use.overriden.hadoop.configs" to true in your Properties object; PigServer will then use the properties defined there instead of looking for configuration files in your classpath.
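For example, a minimal sketch based on the code in the question (the <server id> values are the question's placeholders, and mapred.job.tracker is assumed alongside fs.default.name):
import java.util.Properties;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class CommitPig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Tell PigServer to trust these properties instead of scanning
        // the classpath for hadoop-site.xml / core-site.xml.
        props.setProperty("pig.use.overriden.hadoop.configs", "true");
        props.setProperty("fs.default.name", "<server id>");
        props.setProperty("mapred.job.tracker", "<server id>");
        PigServer pigServer = new PigServer(ExecType.MAPREDUCE, props);
        // ... register and run queries as in the question ...
    }
}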
Upvotes: 1
Reputation: 813
Doing
export HADOOP_HOME=/path/to/hadoop
and running pig again fixed it for me.
Upvotes: 2
Reputation: 339
Please add the conf folder as a parameter to -classpath; that should work:
-classpath /home/nubes/pig/conf:/home/nubes/hadoop/conf
Upvotes: 1
Reputation: 1146
Try:
export HADOOP_CLASSPATH=/mycluster/conf
You may also check your hadoop-env.sh script to see what the classpath is set to there.
Upvotes: 1