Reputation: 143
I have a problem with Hadoop HDFS (Hadoop 2.7.3). I have 2 namenodes (1 active, 1 standby) and 3 datanodes, and the replication factor is 3.
$ hdfs dfs -df -h /
Filesystem             Size     Used    Available  Use%
hdfs://hadoop-cluster  131.0 T  51.3 T  79.5 T     39%
Used disk is 51.3 T according to the -df command.
$ hdfs dfs -du -h /
912.8 G /dir1
2.9 T /dir2
But used disk is only about 3.8 T in total (912.8 G + 2.9 T) according to the -du command.
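(For reference, the -s flag sums the listing into a single total. Note that in Hadoop 2.7, -du reports the logical, pre-replication size of the files, while -df reports the raw capacity consumed on the datanode disks, so some gap between the two is expected even on a healthy cluster.)
$ hdfs dfs -du -s -h /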
I found that one of the datanodes reached 100% usage.
Live datanodes (3):
datanode1:
Configured Capacity: 48003784114176 (43.66 TB)
DFS Used: 2614091989729 (2.38 TB)
Non DFS Used: 95457946911 (88.90 GB)
DFS Remaining: 45294174318384 (41.19 TB)
DFS Used%: 5.45%
DFS Remaining%: 94.36%
datanode2:
Configured Capacity: 48003784114176 (43.66 TB)
DFS Used: 48003784114176 (43.66 TB)
Non DFS Used: 0
DFS Remaining: 0
DFS Used%: 100%
DFS Remaining%: 0%
datanode3:
Configured Capacity: 48003784114176 (43.66 TB)
DFS Used: 2615226250042 (2.38 TB)
Non DFS Used: 87496531142 (81.49 GB)
DFS Remaining: 45301001735984 (41.20 TB)
DFS Used%: 5.45%
DFS Remaining%: 94.37%
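(A per-datanode listing like the one above can be printed with the standard admin report command:)
$ hdfs dfsadmin -report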
My question is about the balancer. It seems to run, but no block is moved in any iteration and it exits without any error. How can I balance the disk usage of the datanodes? Why does the hdfs balancer command not move any blocks?
19/11/06 11:27:51 INFO balancer.Balancer: Decided to move 10 GB bytes from datanode2:DISK to datanode3:DISK
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: overUtilized => belowAvgUtilized
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for SAME_RACK: underUtilized => aboveAvgUtilized
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => underUtilized
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: overUtilized => belowAvgUtilized
19/11/06 11:27:51 INFO balancer.Balancer: chooseStorageGroups for ANY_OTHER: underUtilized => aboveAvgUtilized
19/11/06 11:27:51 INFO balancer.Balancer: Will move 10 GB in this iteration
19/11/06 11:27:51 INFO balancer.Dispatcher: Limiting threads per target to the specified max.
19/11/06 11:27:51 INFO balancer.Dispatcher: Allocating 5 threads per target.
No block has been moved for 5 iterations. Exiting...
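(A typical invocation looks roughly like this; the bandwidth figure below is only an example, and -threshold is the maximum allowed deviation, in percent, of a datanode's utilization from the cluster average, with 10 being the default:)
$ hdfs dfsadmin -setBalancerBandwidth 52428800
$ hdfs balancer -threshold 10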
Although datanode2 is full, the status of the node is still shown as "In Service" / "Live" / "Normal". Needless to say, I can't write new data to HDFS in this situation.
Why are the results of -df and -du so different?
Upvotes: 1
Views: 759
Reputation: 59
Either add a new datanode, or reduce the replication factor.
Why?
Let's call the most-used node in the cluster alpha, and the two remaining, less-used nodes beta and gamma.
Now, suppose the balancer tries to move a block of "file.txt" from alpha to beta. With a replication factor of 3 and only 3 datanodes, every block must keep a replica on every node, and a node can never hold two replicas of the same block. So beta already has a copy, the move is impossible, and the space used on alpha stays constant. That is why the balancer reports that no block has been moved.
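A sketch of the second option, reducing the replication factor of existing data (the target factor 2 and the path / are examples; -w waits until HDFS has finished deleting the excess replicas, and newly written files are still governed by dfs.replication in hdfs-site.xml, which would need to be lowered separately):
$ hdfs dfs -setrep -w 2 /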
Upvotes: 2