user3313379

Reputation: 489

S3 hive external table on subdirectories is not working

I have the following S3 directory structure:

Data/
   Year=2015/
         Month=01/
            Day=01/
                files
            Day=02/
                files
         Month=02/
            Day=01/
                files
            Day=02/
                files
         .
         .
         .

   Year=2014/
         Month=01/
            Day=01/
                files
            Day=02/
                files
         Month=02/
            Day=01/
                files
            Day=02/
                files

So I am creating a Hive external table as follows:

CREATE EXTERNAL TABLE trips
(
  trip_id STRING, probe_id STRING, provider_id STRING,
  is_moving TINYINT, is_completed BOOLEAN, start_time STRING,
  start_lat DOUBLE, start_lon DOUBLE, start_lat_adj DOUBLE)
PARTITIONED BY (year INT, month INT, day INT)
STORED AS TEXTFILE
LOCATION 's3n://accesskey:secretkey@bucket/data/';

When I run a query on this table, no data is returned and no exception is thrown. If I place the same files in a single directory without partitioning, the query works fine. I also tried setting:

set mapred.input.dir.recursive=true;
set hive.mapred.supports.subdirectories=true;

Any idea where I am going wrong?

Upvotes: 4

Views: 1515

Answers (1)

leftjoin

Reputation: 38290

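You need to run the ALTER TABLE trips RECOVER PARTITIONS command. That is the Amazon EMR syntax; on Apache Hive the equivalent is MSCK REPAIR TABLE trips. Creating a partitioned external table does not register the partitions that already exist under its location, so the metastore has no partitions for the table and queries return nothing. This command creates the metadata for the table partitions that exist in S3. See the docs here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RecoverPartitions(MSCKREPAIRTABLE)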
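
A rough sketch of what that could look like on Apache Hive. The explicit ADD PARTITION variant is an alternative I am adding for illustration, and its LOCATION path is an assumption pieced together from the table location and the directory layout in the question, so adjust it to your actual bucket and key names:

```sql
-- Option 1 (Apache Hive): scan the table location and register
-- every partition directory found there in the metastore.
MSCK REPAIR TABLE trips;

-- Option 2: register a single partition explicitly.
-- The LOCATION below is assumed from the layout shown in the question.
ALTER TABLE trips ADD IF NOT EXISTS
  PARTITION (year=2015, month=1, day=1)
  LOCATION 's3n://accesskey:secretkey@bucket/data/Year=2015/Month=01/Day=01/';

-- List the partitions the metastore now knows about.
SHOW PARTITIONS trips;
```

If SHOW PARTITIONS still returns nothing after the repair, adding one partition explicitly as in option 2 is a quick way to check whether the path itself is the problem.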

Upvotes: 1
