Reputation: 1577
I have the NameNode service installed on a machine called hadoop
.
The core-site.xml
file has the fs.defaultFS
(equivalent to fs.default.name
) set to the following:
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop:8020</value>
</property>
I have a very simple table called test_table
that currently exists in the Hive server on the HDFS. That is, it is stored under /user/hive/warehouse/test_table
. It was created using a very simple command in Hive:
CREATE TABLE new_table (record_id INT);
If I attempt to load data into the table locally (that is, using LOAD DATA LOCAL
), everything proceeds as expected. However, if the data is stored on the HDFS and I want to load from there, an issue occurs.
I run a very simple query to attempt this load:
hive> LOAD DATA INPATH '/user/haduser/test_table.csv' INTO TABLE test_table;
Doing so leads to the following error:
FAILED: SemanticException [Error 10028]: Line 1:17 Path is not legal ''/user/haduser/test_table.csv'':
Move from: hdfs://hadoop:8020/user/haduser/test_table.csv to: hdfs://localhost:8020/user/hive/warehouse/test_table is not valid.
Please check that values for params "default.fs.name" and "hive.metastore.warehouse.dir" do not conflict.
As the error states, it is attempting to move from hdfs://hadoop:8020/user/haduser/test_table.csv
to hdfs://localhost:8020/user/hive/warehouse/test_table
. The first path is correct because it references hadoop:8020
; the second path is incorrect, because it references localhost:8020
.
The core-site.xml
file clearly states to use hdfs://hadoop:8020
. The hive.metastore.warehouse
value in hive-site.xml
correctly points to /user/hive/warehouse
. Thus, I doubt this error message has any true value.
How can I get the Hive server to use the correct NameNode address when creating tables?
Upvotes: 1
Views: 10048
Reputation: 1577
I found that the Hive metastore tracks the location of each table. You can see the that location be running the following in the Hive console.
hive> DESCRIBE EXTENDED test_table;
Thus, this issue occurs if the NameNode in core-site.xml
was changed while the metastore service was still running. Therefore, to resolve this issue the service should be restarted on that machine:
$ sudo service hive-metastore restart
Then, the metastore will use the new fs.defaultFS
for newly created tables such.
The location for tables that already exist can be corrected by running the following set of commands. These were obtained from Cloudera documentation to configure the Hive metastore to use High-Availability.
$ /usr/lib/hive/bin/metatool -listFSRoot
...
Listing FS Roots..
hdfs://localhost:8020/user/hive/warehouse
hdfs://localhost:8020/user/hive/warehouse/test.db
Correcting the NameNode location:
$ /usr/lib/hive/bin/metatool -updateLocation hdfs://hadoop:8020 hdfs://localhost:8020
Now the listed NameNode is correct.
$ /usr/lib/hive/bin/metatool -listFSRoot
...
Listing FS Roots..
hdfs://hadoop:8020/user/hive/warehouse
hdfs://hadoop:8020/user/hive/warehouse/test.db
Upvotes: 5