
Reputation: 434

java.io.IOException: All directories in dfs.datanode.data.dir are invalid

I'm trying to get Hadoop and Hive to run locally on my Linux system, but when I run jps, I notice that the DataNode service is missing:

vaughn@vaughn-notebook:/usr/local/hadoop$ jps
2209 NameNode
2682 ResourceManager
3084 Jps
2510 SecondaryNameNode

If I run bin/hadoop datanode, the following error occurs:

    17/07/13 19:40:14 INFO datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
    17/07/13 19:40:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    17/07/13 19:40:15 WARN datanode.DataNode: Invalid dfs.datanode.data.dir /home/cloudera/hdata/dfs/data : 
    ExitCodeException exitCode=1: chmod: changing permissions of '/home/cloudera/hdata/dfs/data': Operation not permitted

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:559)
        at org.apache.hadoop.util.Shell.run(Shell.java:476)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:723)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:812)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:795)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:646)
        at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:479)
        at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:140)
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:156)
        at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2285)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2327)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2309)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2201)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2248)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2424)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2448)
    17/07/13 19:40:15 FATAL datanode.DataNode: Exception in secureMain
    java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/home/cloudera/hdata/dfs/data/" 
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2336)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2309)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2201)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2248)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2424)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2448)
    17/07/13 19:40:15 INFO util.ExitUtil: Exiting with status 1
    17/07/13 19:40:15 INFO datanode.DataNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down DataNode at vaughn-notebook/127.0.1.1
    ************************************************************/

That directory seems unusual, but I don't think there's anything technically wrong with it. Here are the permissions on the directory:

vaughn@vaughn-notebook:/usr/local/hadoop$ ls -ld /home/cloudera/hdata/dfs/data
drwxrwxrwx 2 root root 4096 Jul 13 19:14 /home/cloudera/hdata/dfs/data

I also removed everything in the tmp folder and formatted the HDFS namenode. Here is my hdfs-site.xml file:

<configuration>

<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
 </property>
 <property>
   <name>dfs.namenode.name.dir</name>
   <value>file:/home/cloudera/hdata/dfs/name</value>
 </property>
 <property>
   <name>dfs.datanode.data.dir</name>
   <value>file:/home/cloudera/hdata/dfs/data</value>
 </property>

</configuration>

And my core-site.xml file:

<configuration>

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/cloudera/hdata</value>
    </property>

</configuration>

In my googling, I've seen some people suggest running "sudo chown hduser:hadoop -R /usr/local/hadoop_store", but when I do that I get the error "chown: invalid user: ‘hduser:hadoop’". Do I have to create this user and group? I'm not really familiar with the process. Thanks in advance for any assistance.
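
I'm guessing I could check whether that user and group even exist with something like the commands below, and maybe just point chown at my own user instead, but I'm not sure that's the right approach:

    getent passwd hduser    # prints nothing if the hduser user doesn't exist
    getent group hadoop     # prints nothing if the hadoop group doesn't exist
    id                      # shows my own user (vaughn) and its groups

    # would it be enough to give my own user ownership of the data directory?
    sudo chown -R vaughn /home/cloudera/hdata/dfs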

Upvotes: 2

Views: 3988

Answers (4)

Aakash Patel

Reputation: 557

One more possible reason, which applied in my case: the location of the HDFS directory shown in the folder properties contained the user name twice, i.e. home/hadoop/hadoop/hdfs, and I had added that same path to hdfs-site.xml. As a fix, I removed the extra hadoop/ and changed it to home/hadoop/hdfs, which resolved my problem.
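
A quick way to catch this kind of mistake (just a rough sketch; the hdfs-site.xml path is whatever your install uses, e.g. /usr/local/hadoop/etc/hadoop/hdfs-site.xml) is to print the configured value and check that the directory really exists:

    grep -A1 'dfs.datanode.data.dir' /usr/local/hadoop/etc/hadoop/hdfs-site.xml
    ls -ld /home/hadoop/hdfs    # the corrected path, without the duplicated hadoop/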

Upvotes: 0

Tahir Dibs

Reputation: 3

sudo chown -R $USER /usr/local/hadoop_store (chown needs an owner argument; use the user that runs Hadoop)

delete the datanode and namenode directories in hadoop_store

stop-dfs.sh and stop-yarn.sh

hdfs namenode -format (not "hadoop fs namenode -format")

start-dfs.sh and start-yarn.sh
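
As one sequence, stopping the daemons first (a rough sketch; the exact directory names under hadoop_store are an assumption, so match them to your dfs.namenode.name.dir and dfs.datanode.data.dir values):

    # stop HDFS and YARN first
    stop-dfs.sh
    stop-yarn.sh

    # hand ownership to the user that runs Hadoop
    sudo chown -R $USER /usr/local/hadoop_store

    # remove the old namenode/datanode storage so it gets recreated cleanly
    rm -rf /usr/local/hadoop_store/hdfs/namenode /usr/local/hadoop_store/hdfs/datanode

    # reformat HDFS and bring everything back up
    hdfs namenode -format
    start-dfs.sh
    start-yarn.sh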

Hope it'll help

Upvotes: 0

SachinJose

Reputation: 8522

This looks like a permission issue: the user that starts the DataNode needs write access to the DataNode data directories.

Try executing the command below before starting the DataNode service.

sudo chmod -R 777 /home/cloudera/hdata/dfs

You can also update the owner and group using the chown command, which is the better option.

Edit

If the DataNode startup still fails, try updating the file ownership with the command below before starting the DataNode.

sudo chown -R vaughn:root /home/cloudera/hdata/dfs
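
After the chown, you can verify it took effect and then try the DataNode again (a quick sketch using the paths and scripts from your question):

    ls -ld /home/cloudera/hdata/dfs/data    # should now show vaughn as the owner
    start-dfs.sh                            # or run bin/hadoop datanode again
    jps                                     # DataNode should now appear in the list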

Upvotes: 1

vasanth

Reputation: 216

1. sudo chown vaughn:hadoop -R /usr/local/hadoop_store

where hadoop is the group name. Use

grep vaughn /etc/group

in your terminal to see your group name.

2. Clean the temporary directories.

3. Format the namenode.
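
Put together, using the directories from the question's own config files (a rough sketch, so double-check the paths and the group name against your setup):

    # step 1: find your group, then change ownership of the Hadoop storage directory
    grep vaughn /etc/group                  # prints lines like hadoop:x:1001:vaughn if such a group exists
    sudo chown vaughn:hadoop -R /home/cloudera/hdata

    # step 2: clean the temporary/storage directories so HDFS recreates them
    rm -rf /home/cloudera/hdata/dfs/name/* /home/cloudera/hdata/dfs/data/*

    # step 3: format the namenode and restart HDFS
    hdfs namenode -format
    start-dfs.sh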

Hope this helps.

Upvotes: 3
