oikonomiyaki
oikonomiyaki

Reputation: 7951

Create an external Hive table from an existing external table

I have a set of CSV files in a HDFS path and I created an external Hive table, let's say table_A, from these files. Since some of the entries are redundant, I tried creating another Hive table based on table_A, say table_B, which has distinct records. I was able to create table_B as a non-external table (Hive warehouse). I want to know if I can create table_B as an external table? If this is possible, will it copy the records from table_A and create its own table_B store on a specified path (preferably as CSV also)?

Upvotes: 0

Views: 2184

Answers (1)

rbyndoor
rbyndoor

Reputation: 729

I am presuming you want to select distinct data from "uncleaned" table and insert into "cleaned" table.

CREATE EXTERNAL TABLE `uncleaned`(
  `a` int, 
  `b` string,
  `c` string, 
  `d` string, 
  `e` bigint
  ) 
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  '/external/uncleaned'

create another table, it can be external or not(doesn't matter).

CREATE EXTERNAL TABLE `cleaned`(
  `a` int, 
  `b` string,
  `c` string, 
  `d` string, 
  `e` bigint
  ) 
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  '/external/cleaned'

Read from first table and you can insert it by

insert overwrite table cleaned 

select distinct a,b,c,d,e from uncleaned;

Upvotes: 2

Related Questions