user1050619

Reputation: 20906

How to point to a single file with external table

I'm trying to load HDFS data as an external table but get the following error.

The folder ml-100k contains multiple files with different datasets, so I need to load only that particular file.

hive> create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int) location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data'
    > ;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data is not a directory or unable to create one)

Upvotes: 2

Views: 2253

Answers (2)

54l3d

Reputation: 3973

You cannot create a Hive table over a specific file; you need to give it a directory. So you can create a subdirectory under ml-100k/, put the file there, and use it like this:

create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int) location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfolder/';
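
A minimal sketch of the whole sequence, run from the Hive CLI, which passes `dfs` commands through to HDFS. The subdirectory name is a placeholder, and the `row format` clause is an assumption that u.data is the standard tab-separated MovieLens file; drop or adjust it if your copy differs.

```sql
-- Create an empty directory that will hold only this one file (placeholder name).
dfs -mkdir -p hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfolder;

-- Copy the file in; use -mv instead of -cp if you do not want to keep two copies.
dfs -cp hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfolder/;

-- Point the external table at the directory, not the file.
-- Assumption: the input is tab-separated, hence the row format clause.
create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int)
row format delimited fields terminated by '\t'
location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/new_subfolder/';
```

Because the directory contains only u.data, the table sees only that file, and later inserts write new files next to it without touching the other datasets in ml-100k.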

The bug mentioned by @Dudu may solve this specific case, but it's not safe for general use, because inserting into such a table will create new files and will never append to the specified one!

Upvotes: 0

David דודו Markovitz

Reputation: 44991

You cannot create a table that points to a file, only to a directory, but there is a feature/bug that allows you to alter the location to a specific file.

create external table movie_ratings (movie_id int, user_id int, ratings int, field_4 int) location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k';

alter table movie_ratings set location 'hdfs://hadoop-master:8020/user/hduser/gutenberg/ml-100k/u.data';
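
One way to check which files the table actually reads after the `alter table` is Hive's `INPUT__FILE__NAME` virtual column. This is only a verification sketch and assumes the `movie_ratings` table from the statements above already exists.

```sql
-- Show the table's metadata, including its current Location.
describe formatted movie_ratings;

-- List the distinct source files behind the rows.
-- If the location change took effect, only the u.data path should appear.
select distinct INPUT__FILE__NAME from movie_ratings;
```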

Upvotes: 4
