Marta Karas

Reputation: 5175

Hadoop Pseudo-Distributed Operation error: Protocol message tag had invalid wire type

I am setting up a Hadoop 2.6.0 Single Node Cluster, following the hadoop-common/SingleCluster documentation, on Ubuntu 14.04. So far I have managed to run the Standalone Operation successfully.

I face an error when trying to perform Pseudo-Distributed Operation. I managed to start the NameNode daemon and DataNode daemon. jps output:

martakarass@marta-komputer:/usr/local/hadoop$ jps
4963 SecondaryNameNode
4785 DataNode
8400 Jps
martakarass@marta-komputer:/usr/local/hadoop$ 

But when I try to make the HDFS directories required to execute MapReduce jobs, I receive the following error:

martakarass@marta-komputer:/usr/local/hadoop$ bin/hdfs dfs -mkdir /user
15/05/01 20:36:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
mkdir: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "marta-komputer/127.0.0.1"; destination host is: "localhost":9000; 
martakarass@marta-komputer:/usr/local/hadoop$ 

(I believe I can ignore the WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... warning at this point.)
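One diagnostic worth running at this point (a sketch, not part of the original question): the InvalidProtocolBufferException usually means the client reached "localhost":9000 but whatever answered was not a NameNode speaking its protobuf RPC protocol, or nothing answered at all. Notice also that the jps listing above shows no NameNode process. A quick check of what is actually listening on port 9000:

```shell
# Check whether anything is listening on the NameNode RPC port (9000).
# Falls back to an explanatory message when the port is unoccupied.
ss -ltnp 2>/dev/null | grep ':9000' \
    || echo "nothing is listening on port 9000"
```

If the port is free, the NameNode never bound it, and its log under the Hadoop logs/ directory should say why.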


When it comes to Hadoop config files, I changed only the files mentioned in the documentation. I have:

etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

I managed to connect to localhost:

martakarass@marta-komputer:~$ ssh localhost
martakarass@localhost's password: 
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-45-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

Last login: Fri May  1 20:28:58 2015 from localhost

I formatted the filesystem:

martakarass@marta-komputer:/usr/local/hadoop$  bin/hdfs namenode -format
15/05/01 20:30:21 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = marta-komputer/127.0.0.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
(...)
15/05/01 20:30:24 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at marta-komputer/127.0.0.1
************************************************************/

/etc/hosts:

127.0.0.1       localhost
127.0.0.1       marta-komputer

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

etc/hostname:

marta-komputer

Upvotes: 5

Views: 11535

Answers (4)

Rajesh N

Reputation: 2574

Make these changes in /etc/hosts:

1. Change:

127.0.0.1    localhost
127.0.0.1    marta-komputer

to one line

127.0.0.1    localhost    marta-komputer

2. Delete (if it exists):

127.0.1.1    marta-komputer

3. Add:

your-system-ip    marta-komputer

To find your system IP, type this in the terminal:

ifconfig

(find your IP address here) or type this:

ifdata -pa eth0
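Note that ifdata comes from the moreutils package and may not be installed; as a sketch, two alternative ways to find the address on a stock Ubuntu system:

```shell
# First non-loopback IPv4 address of this machine:
hostname -I | awk '{print $1}'

# All IPv4 addresses (including loopback), one per interface line:
ip -4 addr show | awk '/inet /{print $2}'
```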

Your final /etc/hosts file should look like:

127.0.0.1       localhost       marta-komputer
your-system-ip       marta-komputer

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
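A quick sanity check (sketch) that the edited /etc/hosts resolves the way Hadoop expects:

```shell
# Both names should resolve; marta-komputer should map to the
# address you added (the hostname is from the question above).
getent hosts localhost
getent hosts marta-komputer || echo "marta-komputer does not resolve yet"
```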

Change core-site.xml (the fs.defaultFS value):

1. Change:

hdfs://localhost:9000

to

hdfs://marta-komputer:9000

Now, stop and start hadoop processes.

Your jps command should list these processes (Hadoop 2.x has no TaskTracker; the ResourceManager and NodeManager appear once YARN is started):

NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager

If it does not list all these processes, check respective logs for errors.

UPDATE:

  1. Follow this tutorial here

  2. If the problem persists, it might be due to a permissions issue.

UPDATE II:

  1. Create a directory and change permissions for namenode and datanode:

sudo mkdir -p /usr/local/hdfs/namenode

sudo mkdir -p /usr/local/hdfs/datanode

sudo chown -R hduser:hadoop /usr/local/hdfs/namenode

sudo chown -R hduser:hadoop /usr/local/hdfs/datanode

  2. Add these properties in hdfs-site.xml:

dfs.datanode.data.dir with value /usr/local/hdfs/datanode

dfs.namenode.name.dir with value /usr/local/hdfs/namenode

  3. Stop and start hadoop processes.
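In hdfs-site.xml form, those two properties would look like the following sketch (in Hadoop 2.x the NameNode directory property is dfs.namenode.name.dir; the paths match the directories created above):

```xml
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hdfs/namenode</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hdfs/datanode</value>
</property>
```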

Upvotes: 1

Dimitris Fasarakis Hilliard

Reputation: 160657

This is a set of steps I followed on Ubuntu when facing exactly the same problem, but with 2.7.1; the steps shouldn't differ much for previous or future versions (I'd believe).

1) Format of my /etc/hosts file:

    127.0.0.1    localhost   <computer-name>
    # 127.0.1.1    <computer-name>
    <ip-address>    <computer-name>

    # Rest of file with no changes

2) *.xml configuration files (displaying contents inside <configuration> tag):

  • For core-site.xml:

        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://localhost/</value>
        </property>
        <!-- set value to a directory you want with an absolute path -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/set/a/directory/on/your/machine</value>
            <description>A base for other temporary directories</description>
        </property>
    
  • For hdfs-site.xml:

        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
    
  • For yarn-site.xml:

        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>localhost</value>
        </property>
    
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
    
  • For mapred-site.xml:

        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
    

3) Verify $HADOOP_CONF_DIR:

This is a good opportunity to verify that you are indeed using this configuration. In the folder where your .xml files reside, view the contents of the hadoop-env.sh script and make sure $HADOOP_CONF_DIR points at the right directory.

4) Check your PORTS:

NameNode binds ports 50070 and 8020 on my standard distribution and DataNode binds ports 50010, 50020, 50075 and 43758. Run sudo lsof -i to be certain no other services are using them for some reason.
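A scripted version of that check (a sketch; the port list assumes stock Hadoop 2.x defaults and the 9000 from the question's core-site.xml, so adjust it to your configuration):

```shell
# Report whether any of the default HDFS/NameNode ports are occupied.
for port in 8020 9000 50010 50020 50070 50075; do
    if ss -ltn 2>/dev/null | grep -q ":$port "; then
        echo "port $port is in use"
    else
        echo "port $port is free"
    fi
done
```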

5) Format if necessary:

At this point, if you have changed the value of hadoop.tmp.dir, you should reformat the NameNode with hdfs namenode -format. If not, remove the temporary files already present in the tmp directory you are using (default: /tmp/).

6) Start Nodes and Yarn:

In sbin/, start the NameNode and DataNode using the start-dfs.sh script, and YARN with start-yarn.sh, then evaluate the output of jps:

    ./start-dfs.sh   
    ./start-yarn.sh

At this point if NameNode, DataNode, NodeManager and ResourceManager are all running you should be set to go!

If any of these haven't started, share the log output so we can re-evaluate.

Upvotes: 4

Pravin yadav

Reputation: 707

I got this error once when I was uploading files to HDFS from Java code. The issue was that I was using a Hadoop 1 JAR to connect to a Hadoop 2 installation. I'm not sure what the problem is in your case, but if you ever configured Hadoop 1 earlier, something from it may be interfering.

Upvotes: 0

Yosser Goupil

Reputation: 799

Remove 127.0.0.1 localhost from /etc/hosts and change your core-site.xml as follows:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://marta-komputer:9000</value>
    </property>
</configuration>

You can ignore the WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... warning.

Upvotes: 1
