MoustafaAAtta

Reputation: 1081

How to append to an hdfs file on an extremely small cluster (3 nodes or less)

I am trying to append to a file on HDFS on a single-node cluster. I also tried on a 2-node cluster but got the same exceptions.

In hdfs-site.xml, I have dfs.replication set to 1. If I set dfs.client.block.write.replace-datanode-on-failure.policy to DEFAULT, I get the following exception:

java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010], original=[10.10.37.16:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
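As the message notes, this is a client-side setting, so besides hdfs-site.xml it can also be set directly on the client's Configuration; a minimal sketch (NEVER being the value suggested for very small clusters):

Configuration conf = new Configuration();
// Client-side equivalent of the hdfs-site.xml entry; NEVER is the value
// suggested in hdfs-default.xml for clusters of 3 nodes or less.
conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");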

If I follow the recommendation in the comment on that configuration in hdfs-default.xml for extremely small clusters (3 nodes or less) and set dfs.client.block.write.replace-datanode-on-failure.policy to NEVER, I get the following exception:

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot append to file/user/hadoop/test. Name node is in safe mode.
The reported blocks 1277 has reached the threshold 1.0000 of total blocks 1277. The number of live datanodes 1 has reached the minimum number 0. In safe mode extension. Safe mode will be turned off automatically in 3 seconds.

Here's how I try to append:

import java.io.OutputStream;
import java.io.PrintWriter;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://MY-MACHINE:8020/user/hadoop");
conf.set("hadoop.job.ugi", "hadoop");

FileSystem fs = FileSystem.get(conf);
OutputStream out = fs.append(new Path("/user/hadoop/test"));

PrintWriter writer = new PrintWriter(out);
writer.print("hello world");
writer.close();

Is there something I am doing wrong in the code? Maybe there is something missing in the configuration? Any help will be appreciated!

EDIT

Even though dfs.replication is set to 1, when I check the status of the file through

FileStatus[] status = fs.listStatus(new Path("/user/hadoop"));

I find that status[i].block_replication is set to 3. I don't think this is the problem, because when I changed the value of dfs.replication to 0 I got a relevant exception, so it does seem to obey the value of dfs.replication. Still, to be on the safe side, is there a way to change the block_replication value per file?

Upvotes: 13

Views: 4536

Answers (2)

user1002065

Reputation: 615

I also faced the same exception you initially posted, and I solved the problem thanks to your comments (setting dfs.replication to 1).

But there is something I don't understand: what happens if I do have replication? Isn't it possible to append to a file in that case?

I would appreciate your answer, especially if you have experience with this.

Thanks

Upvotes: 1

MoustafaAAtta

Reputation: 1081

As I mentioned in the edit, even though dfs.replication is set to 1, fileStatus.block_replication is set to 3.

A possible solution is to run

hadoop fs -setrep -w 1 -R /user/hadoop/

which recursively changes the replication factor for every file in the given directory. The command is documented in the HDFS FileSystem shell guide.
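If you would rather do this from code than from the shell, the FileSystem API also exposes setReplication for a single existing file; a minimal sketch (the path is just the one from the question):

// Programmatic equivalent of "hadoop fs -setrep" for one file.
FileSystem fs = FileSystem.get(conf);
fs.setReplication(new Path("/user/hadoop/test"), (short) 1);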

What needs to be investigated now is why the value in hdfs-site.xml is ignored, and how to make 1 the default.

EDIT

It turns out that the dfs.replication property has to be set in the Configuration instance too; otherwise the client requests the default replication factor for the file, which is 3, regardless of the value set in hdfs-site.xml.

Adding the following statement to the code will solve it:

conf.set("dfs.replication", "1");

Upvotes: 11
