Reputation: 389
I have a cluster composed of a master node (which runs only the namenode) and two slaves, namely slave1 and slave2 (which run the datanodes). Now, I want to add a new hard drive only to slave1 and use it to increase the datanode capacity. I followed different tutorials and how-tos on the internet, and I understood how to do it in general. My problem is that adding that partition/hard drive only to slave1 raises problems, because the path to the new partition/hard drive added in hdfs-site.xml won't be found by slave2.
This is what I do on slave1 (the new disk is on sdb):
1. I run fdisk /dev/sdb to create the partition. The process ends without problems, creating /dev/sdb1.
2. I format sdb1 with mkfs.ext4 /dev/sdb1.
3. I mount sdb1 on /disk1 with mount /dev/sdb1 /disk1.
4. I create the folder my/user/hdfs/datanode inside /disk1.
5. I change the ownership/permissions of /disk1/my/user/ to give permission to my user.
6. I stop the datanode with hadoop-daemon.sh stop datanode.
7. I add the path /disk1/my/user/hdfs/datanode to hdfs-site.xml, under the dfs.datanode.data.dir field, using a comma to separate it from the other path already present there. I do this on every machine.

Now, if I stop and start HDFS again from the master, what happens is that the datanode on slave2 won't start because it cannot find the path /disk1/my/user/hdfs/datanode. So my question is: is it possible to add a new partition/hard drive to only one datanode in the cluster? What do I have to do? Do I have to create the same folder on each machine?
Upvotes: 0
Views: 1162
Reputation: 11479
If the two slaves run on separate hardware, you can maintain a separate hdfs-site.xml for each of them. On slave1 it will have the additional disk listed in dfs.datanode.data.dir, while slave2's will not.
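A minimal sketch of how the two configs could differ, assuming the original data directory is /original/hdfs/datanode (a placeholder; keep whatever path is already in your config):

```xml
<!-- hdfs-site.xml on slave1: original directory plus the new disk -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/original/hdfs/datanode,/disk1/my/user/hdfs/datanode</value>
</property>

<!-- hdfs-site.xml on slave2: only the original directory -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/original/hdfs/datanode</value>
</property>
```

After editing the file on slave1, restarting just that datanode (hadoop-daemon.sh stop datanode, then start datanode) should be enough for it to pick up the new directory; slave2 stays untouched.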
Upvotes: 1