Naman Bhargava
Naman Bhargava

Reputation: 59

Loading data into hive table with dynamic partitioning

File empdetails.log has below data-

100 AAA 12000 HYD

101 BBB 13000 PUNE

102 CCC 14000 HYD

103 DDD 10000 BLORE

104 EEE 12000 PUNE

I want to load this data into a 'Emp' table with dynamic partitioning such that select * from Emp; gives me following output(partitioned by location).

100 AAA 12000 HYD

102 CCC 14000 HYD

101 BBB 13000 PUNE

104 EEE 12000 PUNE

103 DDD 10000 BLORE

Could anyone provide the load command to be executed in hive.

TABLE CREATED- create table Emp (cid int, cname string, csal int) partitioned by (cloc string) row format delimited fields terminated by '\t' stored as textfile;

Upvotes: 1

Views: 10973

Answers (1)

Nikhil Bhide
Nikhil Bhide

Reputation: 728

For dynamic partitioning, you have to use INSERT ... SELECT query (Hive insert).

Inserting data into Hive table having DP, is a two step process.

  1. Create staging table in staging database in hive and load data into that table from external source such as RDBMS, document database or local files using Hive load.
  2. Insert data into actual table into ODS (operational data store/final database) using Hive insert.

Also, set following properties in Hive.

  • SET hive.exec.dynamic.partition=true;
  • SET hive.exec.dynamic.partition.mode=nonstrict;

Following example works on cloudera VM.

-- Extract orders data from mysql (Retail_DB.products) 
select * from orders into outfile '/tmp/orders_data.psv' fieldsterminated by '|' lines terminated by 'n';

-- Create Hive table with DP - order_month is DP.
CREATE TABLE orders (order_id int, order_date string, order_customer_id int, order_status string ) PARTITIONED BY (order_month string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'STORED AS TEXTFILE;

--Create staging table in Hive.
CREATE TABLE orders_stage (order_id int,order_date string, order_customer_id int, order_status string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE;

--Load data into staging table (Hive)
Load data into staging table  load data local inpath 
/tmp/orders_data.psv' overwrite into table orders_stage;

--Insert into Orders, which is final table (Hive). 
Insert overwrite table retail_ods.orders partition (order_month)
select order_id, order_date, order_customer_id,order_status,
substr(order_date, 1, 7) order_month from retail_stage.orders_stage;

You can find more details at https://cwiki.apache.org/confluence/display/Hive/DynamicPartitions

Upvotes: 3

Related Questions