dsingh

Reputation: 270

Not able to get rid of the famous DataStreamer Exception: org.apache.hadoop.ipc.RemoteException error

I am running a Hadoop cluster with an Ubuntu host acting as both master and slave, and a virtual machine running on it as another slave (a 2-node cluster).

The solution that is supposed to resolve this problem, described at "No data nodes are started", is not working for me. I tried both of the solutions explained there.

When I manually set the namespace IDs of the affected datanodes to match the namenode's and start the cluster (solution 2 in the linked post), I still get the same DataStreamer Exception. The log of one of the datanodes then shows the same Incompatible namespaceIDs error, but the namespace ID of the datanode shown in the log is different from the one in my tmp/dfs/data/current/VERSION file (which is unchanged, and is the same as the one in tmp/dfs/name/current/VERSION).
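For reference, the manual fix amounts to something like the sketch below (the tmp/dfs/* paths assume the default hadoop.tmp.dir layout, and <ID_FROM_NAMENODE> is a placeholder for the value you read from the namenode):

    # Sketch of "solution 2": make each datanode's namespaceID match the
    # namenode's. Adjust paths to your dfs.name.dir / dfs.data.dir settings.
    stop-all.sh                                    # stop the cluster first

    # Read the authoritative ID on the namenode:
    grep namespaceID tmp/dfs/name/current/VERSION

    # On each affected datanode, replace the ID with the value printed above
    # (<ID_FROM_NAMENODE> is a placeholder):
    sed -i 's/^namespaceID=.*/namespaceID=<ID_FROM_NAMENODE>/' tmp/dfs/data/current/VERSION

    start-all.sh                                   # then restart the cluster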

After many hours of debugging, I am still clueless :(

PS:

I performed a simple test after all this (putting a small file into HDFS), and got:
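The test was along these lines (a reconstruction; the exact command and local file name are inferred from the "put:" line in the output below):

    # Hypothetical reconstruction of the failing test.
    hadoop fs -put mysample /user/dsingh/mysample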

    14/05/04 04:12:54 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/dsingh/mysample could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

        at org.apache.hadoop.ipc.Client.call(Client.java:1113)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
        at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)

    14/05/04 04:12:54 WARN hdfs.DFSClient: Error Recovery for null bad datanode[0] nodes == null
    14/05/04 04:12:54 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/dsingh/mysample" - Aborting...
    put: java.io.IOException: File /user/dsingh/mysample could only be replicated to 0 nodes, instead of 1
    14/05/04 04:12:54 ERROR hdfs.DFSClient: Failed to close file /user/dsingh/mysample
    org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/dsingh/mysample could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

        at org.apache.hadoop.ipc.Client.call(Client.java:1113)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
        at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)

Any clue will help me.

Upvotes: 1

Views: 6541

Answers (3)

Mehedee Hassan

Reputation: 133

I have faced the same problem. It is all about the lack of space dedicated to HDFS. I have 10 virtual machine (VMware) nodes, each with about 3.5 GB of storage for HDFS on average. I am using Hadoop 2.6.

You can decrease the replication factor by changing the value of the "dfs.replication" property in your "_hadoop_location/etc/hadoop/hdfs-site.xml" configuration file (for Hadoop 2.6). Decrease it to a small number (like 1 or 2), and then try to keep each file smaller than your total space, as in the sketch below.
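A minimal hdfs-site.xml sketch (the dfs.replication property name is standard; the value of 1 is just an example):

    <configuration>
      <!-- Default replication factor for newly written files. -->
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

Restart HDFS after changing this; the new default applies only to files written afterwards (for files already in HDFS you can lower replication with "hdfs dfs -setrep").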

If it shows the same problem, try a file size smaller than the one you used last time, or recreate the machines with a larger disk size.

This may be late, but it may help others who face the same problem :) Thank you.

Upvotes: 0

dsingh

Reputation: 270

After working for a few hours on this issue, I finally gave up, and it is still unresolved in my universe of knowledge.

But the good thing is that instead of using a VirtualBox VM as a slave on the same machine, I connected another Ubuntu machine to my master, and everything worked like a charm :) The problem, I guess, could be related to the limited disk space allocated for storage in the virtual machine (it was less than 500 MB in my case), and I have read somewhere that each node in the cluster should have at least 10 GB of free space to keep HDFS happy (a quick check is sketched below).
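A quick way to check this on each slave (standard commands; /app/hadoop/tmp is an assumed path, substitute your own hadoop.tmp.dir / dfs.data.dir):

    # Free disk space where HDFS stores its blocks (assumed path; adjust it).
    df -h /app/hadoop/tmp

    # Confirm the DataNode process is actually running on the slave.
    jps | grep DataNode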

My takeaway: if possible, try the Hadoop cluster on two separate machines rather than using a virtual machine on the same host.

Upvotes: 1

Jing Wang

Reputation: 50

After you did -copyFromLocal, it seems the datanode was up to receive the request to write the file. However, it wasn't able to allocate the blocks needed for the file. Please check the datanode log to see exactly what happened. Also, run "hdfs dfsadmin -report" to make sure you have enough space on the datanodes, as sketched below.
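For example, to pull out just the per-datanode capacity figures (field names like "DFS Remaining" are standard in the report output):

    # Summarize per-datanode space from the report.
    hdfs dfsadmin -report | grep -E 'Name:|DFS Used:|DFS Remaining:'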

Upvotes: 0
