Reputation: 186
CREATE TABLE MY_DATA0 (session_id STRING, userid BIGINT, date_time STRING, ip STRING, URL STRING, country STRING, state STRING, city STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
LOAD DATA INPATH '/inputhive' OVERWRITE INTO TABLE MY_DATA0;
CREATE TABLE part0 (session_id STRING, userid BIGINT, date_time STRING, ip STRING, URL STRING)
PARTITIONED BY (country STRING, state STRING, city STRING)
CLUSTERED BY (userid) INTO 256 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
INSERT OVERWRITE TABLE part0 PARTITION (country, state, city) SELECT session_id, userid, date_time, ip, url, country, state, city FROM my_data0;
Overview of my dataset:
{60A191CB-B3CA-496E-B33B-0ACA551DD503},1331582487,2012-03-12 13:01:27,66.91.193.75,http://www.acme.com/SH55126545/VD55179433,United States,Hauula,Hawaii
{365CC356-7822-8A42-51D2-B6396F8FC5BF},1331584835,2012-03-12 13:40:35,173.172.214.24,http://www.acme.com/SH55126545/VD55179433,United States,El Paso,Texas
When I run the last insert statement, I get this error:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to: 100
PS:
I have set these two properties:
hive.exec.dynamic.partition.mode::nonstrict
hive.enforce.bucketing::true
Upvotes: 4
Views: 18568
Reputation: 1
This may happen because your query is picking the wrong (or a high-cardinality) column as the partition column. With dynamic partitioning, the partition value is taken from the last column of the select list, so with select * from table2 it is whichever column happens to come last in table2. To control this, name the columns explicitly and keep the partition column last: insert into table table1 partition(partition_column) select column_name1, column_name2, partition_column from table2; Earlier I was using select * from the table and was getting 1200 partitions; after manually placing the state column last, I got only 38 partitions.
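A quick way to see whether a column is a sensible partition key is to look at the values it actually holds. The query below is only a sketch against the asker's my_data0 table. In the two sample rows from the question, the field right after the country looks like a city (Hauula, El Paso) although the table declares state in that position; whether that holds for the whole file is an assumption, and this query is one way to check it.

```sql
-- List the most frequent values stored in the "state" column.
-- If city names show up here, the file's field order does not match
-- the table definition and "state" will have far more distinct values
-- than expected.
SELECT state, COUNT(*) AS row_count
FROM my_data0
GROUP BY state
ORDER BY row_count DESC
LIMIT 20;
```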
Upvotes: 0
Reputation: 31
The partition columns must come last in the select statement. For example, if state is the partition column, then "insert into table t1 partition(state) select Id, name, dept, sal, state from t2;" will work. If the query is instead "insert into table t1 partition(state) select Id, name, dept, state, sal from t2;", the partitions will be created from the salary (sal) column.
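To confirm which column ended up driving the partitions, the partition list can be inspected after the insert. This is a minimal sketch that reuses the placeholder tables t1 and t2 from the example above; it assumes dynamic partitioning is already enabled in nonstrict mode for the session.

```sql
-- Partition column (state) is the last item in the select list.
INSERT INTO TABLE t1 PARTITION (state)
SELECT id, name, dept, sal, state
FROM t2;

-- Each line of output should read state=<some state>.
-- Values that look like salaries mean the column order is wrong.
SHOW PARTITIONS t1;
```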
Upvotes: 3
Reputation: 794
Try setting the two properties named in the error message to higher values:
SET hive.exec.max.dynamic.partitions=100000;
SET hive.exec.max.dynamic.partitions.pernode=100000;
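Rather than guessing a limit, it may help to count how many partitions the insert would create and then set the limits above that number. The following is a sketch for the tables in the question, not a tuned configuration: the value 100000 is simply the one suggested above. The "Maximum was set to: 100" in the error matches the usual default of the per-node limit, which suggests that is the setting being hit.

```sql
-- Number of distinct (country, state, city) combinations,
-- i.e. the number of partitions the insert would create.
SELECT COUNT(*) AS partition_count
FROM (
  SELECT country, state, city
  FROM my_data0
  GROUP BY country, state, city
) t;

-- Session settings: the first three are needed for a dynamic,
-- bucketed insert; the last two raise the limits from the error.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.enforce.bucketing=true;
SET hive.exec.max.dynamic.partitions=100000;          -- illustrative value
SET hive.exec.max.dynamic.partitions.pernode=100000;  -- illustrative value

INSERT OVERWRITE TABLE part0 PARTITION (country, state, city)
SELECT session_id, userid, date_time, ip, url, country, state, city
FROM my_data0;
```

If partition_count turns out to be very large, raising the limits only hides the problem: with 256 buckets per partition, the number of output files grows quickly, and a lower-cardinality partition key is probably the better fix.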
Upvotes: 15