Reputation: 664
I am creating a table with aggregate functions on few columns and i don't have any joins. it has more than 7 Million rows and gave distributing by random. it is taking almost 4 to 5 hours.
can any one help how to optimize or best way to create these tables in netezza. unfortunately we don't have any hints to use.
Thanks in advance.
Upvotes: 1
Views: 3248
Reputation: 21
try create table table_name ( field1 bigint, field2 varchar(10))distribute on random
insert into table_name ()
generate statistics on table_name
the key is controlling your data types. if your recordset is large varchar fields, netezza is going to struggle for speed. if you can take your number fields and cast them as integer types, you will see better performance. All your varchar fields need to be "reasonable" (don't use varchar(5000) on every field.
another option:
perhaps the table statistics are not updated.
generate express statistics on table_name;
last option:
what is the distrubution key on the existing table? Perhaps that is driving your problem, in that the table has high skew which means you are not distributing across the box efficiently. If you own the table, you may want to rebuild with a better distribution key, keeping in mind that it will be slow, but the future performance of the table will be better.
hope that helps.
Upvotes: 2
Reputation: 1043
Have you tried CTAS (Create table as ...) option, with CTAS option data being distributes in dataslices only without involvement of host. So this is faster and easier way.
Upvotes: 0