serverliving.com

Reputation: 457

Hive Dynamic partition issue

I have two tables, as below.

CREATE EXTERNAL TABLE IF NOT EXISTS TEMP_tab(id int,mytime STRING,age int)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 'hdfs://xxx';

CREATE TABLE IF NOT EXISTS main_TAB(id int,age int)
PARTITIONED BY (mytime STRING)
STORED AS ORC
tblproperties ("orc.compress"="ZLIB");

FROM TEMP_TAB INSERT OVERWRITE TABLE main_TAB 
PARTITION (mytime) 
SELECT *,substr(mytime,0,10) as mytime;

But the strange thing is that the INSERT does not work. It fails with the following error message:

Error: org.apache.spark.sql.AnalysisException: Cannot insert into table m16.main_TAB because the number of columns are different: need 2 columns, but query has 3 columns.; (state=,code=0)

I have already set these two properties as well:

SET hive.exec.dynamic.partition = true
SET hive.exec.dynamic.partition.mode = nonstrict

Upvotes: 1

Views: 514

Answers (1)

OneCricketeer

Reputation: 191681

Now that all the typos are fixed, it's easier to see what you're doing.

In that last line, you've selected mytime twice. I'm not sure why the error doesn't say you have 4 columns, but either way, don't use the * since you have a partition column with the same name as a column in the source table.

SELECT id, age, substr(mytime,0,10) as mytime;

And when inserting into partitioned tables, I've noticed that the partition column(s) always need to be selected last.
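
Putting both points together, the full statement would look something like this (a sketch based on the tables defined in the question):

-- Dynamic partitioning settings are session-scoped, so run them in the same session as the INSERT
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- List the columns explicitly and keep the partition column (mytime) last
FROM TEMP_TAB
INSERT OVERWRITE TABLE main_TAB PARTITION (mytime)
SELECT id, age, substr(mytime, 0, 10) AS mytime;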

Upvotes: 1
