Restarting datanodes after reformatting the namenode in a Hadoop cluster

Question

Using the basic configuration provided in the official Hadoop setup documentation, I can run a Hadoop cluster and submit MapReduce jobs.

The problem is that whenever I stop all the daemons, reformat the namenode, and then start all the daemons again, the datanode does not start.

I've been looking around for a solution, and it appears that this happens because formatting only affects the namenode; the datanode's storage directories also need to be erased.

How can I do this? What changes do I need to make to my config files? After those changes are made, how do I delete the correct files when formatting the namenode again?

user3484461 · Accepted Answer

It depends on whether you have set the following two parameters, which are defined in hdfs-site.xml:

dfs.name.dir: Determines where on the local filesystem the DFS name node should store the name table (fsimage). If this is a comma-delimited list of directories, then the name table is replicated in all of the directories, for redundancy.

dfs.data.dir: Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.

If you have set specific directory locations for these two parameters, you need to delete those directories as well before formatting the namenode.
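To see which directories are actually configured, you can read the two properties out of hdfs-site.xml. Here is a minimal Python sketch, assuming the usual `<configuration>`/`<property>` layout of a Hadoop site file. The helper names and the sample paths are made up for illustration; substitute the contents of your own file.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def split_dirs(value):
    """Split a comma-delimited directory list into individual paths."""
    return [d.strip() for d in value.split(",") if d.strip()]


# Hypothetical hdfs-site.xml content; these paths are illustrative only.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/data/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/disk1/hdfs/data,/disk2/hdfs/data</value>
  </property>
</configuration>
"""

props = read_hadoop_properties(SAMPLE_HDFS_SITE)
name_dirs = split_dirs(props["dfs.name.dir"])
data_dirs = split_dirs(props["dfs.data.dir"])
print("namenode dirs:", name_dirs)
print("datanode dirs:", data_dirs)
```

Every path in both lists is a directory that would have to be removed before reformatting.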

If you have not set these two parameters, the directories are created by default under the location given by the following parameter:

hadoop.tmp.dir, which can be configured in core-site.xml

Again, if you have set this parameter, you need to remove that directory before formatting the namenode.

If you have not set it either, the directory is created by default at /tmp/hadoop-$username, where $username is the user that runs Hadoop (for example, hadoop), so you need to remove that directory.
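The fallback order described above can be written down as a small lookup. This is only a sketch: `resolve_storage_dirs` is a hypothetical helper, the two dictionaries stand in for the parsed contents of hdfs-site.xml and core-site.xml, and the `dfs/name` and `dfs/data` subdirectories are the stock defaults as I understand them for releases that use these property names. It also does not expand `${...}` references inside values, which real Hadoop configuration does.

```python
def resolve_storage_dirs(hdfs_site, core_site, username):
    """Work out where the namenode and datanode keep their data.

    hdfs_site / core_site: dicts of property name -> value (may be empty).
    username: the account that runs the Hadoop daemons.
    """
    # hadoop.tmp.dir falls back to /tmp/hadoop-<username> when unset.
    tmp_dir = core_site.get("hadoop.tmp.dir", "/tmp/hadoop-" + username)

    # Assumed stock defaults: both directories live under hadoop.tmp.dir.
    name_value = hdfs_site.get("dfs.name.dir", tmp_dir + "/dfs/name")
    data_value = hdfs_site.get("dfs.data.dir", tmp_dir + "/dfs/data")

    return {
        "name": [d.strip() for d in name_value.split(",") if d.strip()],
        "data": [d.strip() for d in data_value.split(",") if d.strip()],
    }


# Nothing configured at all: everything ends up under /tmp.
defaults = resolve_storage_dirs({}, {}, "hadoop")
print(defaults)

# Only hadoop.tmp.dir configured (illustrative path).
custom_tmp = resolve_storage_dirs({}, {"hadoop.tmp.dir": "/var/hadoop"}, "hadoop")
print(custom_tmp)
```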

Summary: you have to delete the namenode and datanode directories before formatting the system. By default they are created under /tmp/.
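Put together, the whole sequence could look like the sketch below. It assumes the `stop-all.sh` and `start-all.sh` scripts and the `hadoop namenode -format` command are on your PATH, as in the older releases that use these property names; newer releases may use different commands. `reformat_plan` and `execute` are hypothetical helpers, and the script defaults to a dry run that only prints the steps, because the delete step is destructive.

```python
import shutil
import subprocess


def reformat_plan(storage_dirs):
    """Build the ordered steps: stop, delete storage dirs, format, start."""
    steps = [("run", ["stop-all.sh"])]
    steps += [("delete", d) for d in storage_dirs]
    steps.append(("run", ["hadoop", "namenode", "-format"]))
    steps.append(("run", ["start-all.sh"]))
    return steps


def execute(steps, dry_run=True):
    """Print the steps, or actually perform them when dry_run is False."""
    for kind, target in steps:
        if dry_run:
            print(kind, target)
        elif kind == "delete":
            # Permanently removes the directory and all HDFS data in it.
            shutil.rmtree(target, ignore_errors=True)
        else:
            subprocess.run(target, check=True)


# Default location for a user named "hadoop" when nothing is configured.
steps = reformat_plan(["/tmp/hadoop-hadoop"])
execute(steps, dry_run=True)
```

Keep in mind that the datanode directories live on the datanode machines, so on a multi-node cluster the delete step has to be carried out on every node, not just on the one where you format the namenode.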
