Reputation: 20906
I'm trying to load HDFS data as an external table but get the following error.
The folder ml-100k contains multiple files with different datasets, so I need to load only that particular file.
hive> create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int) location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data'
> ;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data is not a directory or unable to create one)
Upvotes: 2
Views: 2253
Reputation: 3973
You cannot create a Hive table over a specific file; you need to point it at a directory.
So you can create a subdirectory under ml-100k/, move u.data into it,
and use it like this:
create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int) location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfolder/'
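For example, a minimal sketch of the directory setup (assuming the new_subfolder name above and that u.data can be moved rather than copied), using dfs pass-through commands run from the Hive CLI:

-- create the subdirectory and move u.data into it
dfs -mkdir hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfolder;
dfs -mv hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfolder/;

The same commands can be run outside Hive with hdfs dfs if you prefer.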
The bug mentioned by @Dudu may solve a specific case, but it's not safe for general use, because inserting into such a table will create new files and will never append to the specified one!
Upvotes: 0
Reputation: 44991
You cannot create a table that points to a file, only to a directory; however, there is a feature/bug that allows you to alter the location afterwards so that it points to a specific file.
create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int) location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k';
alter table movie_ratings set location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data';
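As a follow-up sketch (assuming u.data is the tab-delimited MovieLens 100k ratings file, which is an assumption about your data), the table will likely also need a matching row format, otherwise the columns come back NULL with Hive's default delimiter; a quick select then verifies the load:

-- row format assumes tab-separated fields (MovieLens 100k u.data format)
create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int)
row format delimited fields terminated by '\t'
location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k';
alter table movie_ratings set location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data';
-- quick sanity check that rows parse into columns
select * from movie_ratings limit 5;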
Upvotes: 4