Reputation: 137
I am getting the error below:
Diagnostics: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-467931813-10.3.20.155-1514489559979:blk_1073741991_1167 file=/user/oozie/share/lib/lib_20171228193421/oozie/hadoop-auth-2.7.2-amzn-2.jar
Failing this attempt. Failing the application.
I have set the replication factor to 3 for the /user/oozie/share/lib/ directory. Most of the jars under this path are available on 3 DataNodes, but a few jars are missing. Can anybody suggest why this is happening and how to prevent it?
Upvotes: 8
Views: 23561
Reputation: 158
I got the same error when using Trino to connect to Hive. I tried connecting to HDFS from a Trino worker and found that port 9866 was not open on the HDFS DataNode. Opening that port on the DataNode solved the problem. Related documentation: https://www.ibm.com/docs/en/spectrum-scale-bda?topic=requirements-firewall-recommendations-hdfs-transparency https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
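If you want to confirm the port problem before touching firewall rules, a quick reachability check from the client machine can help. The following is only a minimal sketch: the function name `can_connect` is made up for illustration, and it assumes the DataNode data-transfer port is 9866 (the Hadoop 3.x default for `dfs.datanode.address`; Hadoop 2.x clusters typically use 50010).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    can_connect is a hypothetical helper, not part of any Hadoop tooling.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_connect("datanode1.example.com", 9866)` (the hostname is a placeholder) returning False from a worker while the NameNode port is reachable would point to a blocked DataNode port rather than a genuinely missing block.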
Upvotes: 0
Reputation: 1
Please check the file's owner in the HDFS directory. I hit this issue because the owner was "root"; it was solved when I changed the owner to "your_user".
Upvotes: 0
Reputation: 21
I was getting the same exception while trying to read a file from HDFS. The solution in the section "Clients use Hostnames when connecting to DataNodes" of this page worked for me: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html#Clients_use_Hostnames_when_connecting_to_DataNodes
I added this XML block to "hdfs-site.xml" and restarted the datanode and namenode servers:
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>true</value>
<description>Whether clients should use datanode hostnames when
connecting to datanodes.
</description>
</property>
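To double-check that the property actually ended up in the file the client reads, something like the sketch below could be used. It assumes the usual Hadoop layout of `<property>` elements inside a `<configuration>` root; the helper names `read_hadoop_properties` and `uses_datanode_hostname` are invented for this example and are not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text: str) -> dict:
    """Map each <property> name to its value in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def uses_datanode_hostname(xml_text: str) -> bool:
    """Return True if dfs.client.use.datanode.hostname is set to true.

    A missing property is treated as false, which matches the documented default.
    """
    value = read_hadoop_properties(xml_text).get(
        "dfs.client.use.datanode.hostname", "false"
    )
    return value.lower() == "true"
```

Reading the contents of the client's hdfs-site.xml and passing them to `uses_datanode_hostname` should return True once the block above is in place.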
Upvotes: 1