JaemyeongEo

Reputation: 403

Cannot find Hadoop Configuration in classpath, running mapreduce in server from local using Java

So, here is the code I am running so far:

import java.io.IOException;
import java.util.Properties;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;


public class CommitPig {

    public static void main(String[] args)
    {
        try {
            String pigScript = "category_count.pig";
            // user-defined helper class that reads in the script file
            pigScriptReader psReader = new pigScriptReader();
            psReader.readPigScript( pigScript );
        } catch ( IOException e){
            e.printStackTrace();
        }

        try{
            // point Pig at the remote cluster
            Properties props = new Properties();
            props.setProperty("fs.default.name", "<server id>");
            props.setProperty("mapred.job.tracker.http.address", "<server id>");
            props.setProperty("mapred.job.tracker", "<server id>");
            PigServer pigServer = new PigServer( ExecType.MAPREDUCE, props);
            runIdQuery(pigServer,"<input location>");

        } catch ( Exception e){
            e.printStackTrace();
        }

    }

    private static void runIdQuery(PigServer pigServer, String inputFile) throws IOException {

        pigServer.registerQuery("A = load '" + inputFile + "' using PigStorage(' ');");
        pigServer.registerQuery("B = filter A BY $0 == 'testing';");
        pigServer.store("B","id.out");

    }
}

I am trying to connect to a cluster server from my local machine using Java, in order to run Pig queries.

It is giving me this error:

ERROR 4010: Cannot find hadoop configurations in classpath (neither hadoop-site.xml nor core-site.xml was found in the classpath)

I tried to set up the classpath on the cluster by following the instructions from Apache:

Running the Pig Scripts in Mapreduce Mode: To run the Pig scripts in mapreduce mode, do the following:

  1. Set the PIG_CLASSPATH environment variable to the location of the cluster configuration directory (the directory that contains the core-site.xml, hdfs-site.xml and mapred-site.xml files):

     export PIG_CLASSPATH=/mycluster/conf

  2. Set the HADOOP_CONF_DIR environment variable to the location of the cluster configuration directory:

     export HADOOP_CONF_DIR=/mycluster/conf

However, I am still getting the same error. Am I misunderstanding something here? Can someone help me understand what exactly the issue is and how to solve it?

Thank you!

Upvotes: 2

Views: 9769

Answers (6)

halayudha

Reputation: 1

I included the Hadoop configuration files (core-site.xml and mapred-site.xml) as resources in Maven's pom.xml:

<build>
  ...
  <resources>
    <resource>
      <directory>[hadoop-directory]/etc/hadoop</directory>
      <includes>
        <include>core-site.xml</include>
        <include>mapred-site.xml</include>
      </includes>
    </resource>
  </resources>
  ...
</build>
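
Putting the cluster's XML files on the build's resource path means they end up on the application classpath, which is exactly where PigServer looks for them, so no PIG_CLASSPATH or HADOOP_CONF_DIR environment variables are needed.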

Upvotes: 0

Walker Rowe

Reputation: 973

export HADOOP_CLASSPATH=$HADOOP_HOME/etc/hadoop

Upvotes: 0

Francesco Masucci

Reputation: 11

You have to set the property "pig.use.overriden.hadoop.configs" to true in your Properties object. PigServer will then use the properties you defined instead of looking for configuration files in your classpath.
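
A minimal sketch of what that could look like, adapted to the code in the question (the host addresses and ports are placeholders, and the property key is spelled here exactly as in this answer):

import java.util.Properties;

import org.apache.pig.ExecType;
import org.apache.pig.PigServer;

public class CommitPigWithProps {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Use these properties instead of scanning the classpath
        // for hadoop-site.xml / core-site.xml.
        props.setProperty("pig.use.overriden.hadoop.configs", "true");
        props.setProperty("fs.default.name", "hdfs://<namenode host>:<port>");
        props.setProperty("mapred.job.tracker", "<jobtracker host>:<port>");

        PigServer pigServer = new PigServer(ExecType.MAPREDUCE, props);
        pigServer.registerQuery("A = load '<input location>' using PigStorage(' ');");
        pigServer.store("A", "id.out");
    }
}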

Upvotes: 1

Rafi

Reputation: 813

Doing

export HADOOP_HOME=/path/to/hadoop

and running pig again fixed it for me.

Upvotes: 2

Nag

Reputation: 339

Please add the conf folders as a parameter to -classpath. That should work:

 -classpath /home/nubes/pig/conf:/home/nubes/hadoop/conf
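
For the program in the question, the full invocation would then look something like this (the jar entries are placeholders):

 java -classpath /home/nubes/pig/conf:/home/nubes/hadoop/conf:<pig jar>:<your app jar> CommitPig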

Upvotes: 1

Engineiro

Reputation: 1146

Attempt:

export HADOOP_CLASSPATH=/mycluster/conf

You may also check your hadoop-env.sh script to see what the classpath is set to there.
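
For example, something like this shows what hadoop-env.sh sets (its location is $HADOOP_HOME/conf or $HADOOP_HOME/etc/hadoop, depending on the Hadoop version):

 grep CLASSPATH $HADOOP_HOME/conf/hadoop-env.sh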

Upvotes: 1
