Reputation: 1003
I have about 2 million rows of data that need to be inserted into PostgreSQL, but the insert performance is poor. Can I achieve high-performance inserts by splitting the large transaction into smaller ones (which I would rather not do)? Or is there some other, better solution?
Upvotes: 4
Views: 12562
Reputation: 16025
Possible methods to improve performance:
- Use the COPY command (a sketch follows below this list).
- Lower the isolation level for the transaction if your data can deal with the consequences.
- Disable write barriers (the nobarrier flag for the ext4 file system) and/or fsync on the PostgreSQL server. Warning: this is usually unsafe but will improve your performance a lot.

Splitting the insert job into a series of smaller transactions will help only if you have to retry the transaction because of data dependency issues with parallel transactions. If the transaction succeeds on the first try, splitting it into several smaller transactions run in sequence will only decrease your performance.
Upvotes: 7
Reputation: 43
In my experience you CAN improve INSERT time-to-completion by splitting a large transaction into smaller ones, but only if the table you are inserting into has NO indexes or constraints applied, and NO default field values that would have to contend for a shared resource under multiple concurrent transactions. In that case, splitting the insert into several distinct parts and submitting each one concurrently as a separate process will complete the job in significantly less time.
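A rough sketch of that idea, assuming a bare table with no indexes, constraints, or contended defaults; the table items, the connection string, and the chunking scheme are all hypothetical:

```python
import multiprocessing

import psycopg2
from psycopg2.extras import execute_values

DSN = "dbname=mydb user=myuser"  # hypothetical connection string


def insert_chunk(rows):
    # Each worker process opens its own connection and runs its own transaction.
    conn = psycopg2.connect(DSN)
    try:
        with conn, conn.cursor() as cur:
            # execute_values packs many rows into each INSERT statement.
            execute_values(cur, "INSERT INTO items (id, name, price) VALUES %s", rows)
    finally:
        conn.close()


def load_concurrently(all_rows, workers=4):
    # Split the data into one chunk per worker and insert the chunks in parallel.
    chunk_size = (len(all_rows) + workers - 1) // workers
    chunks = [all_rows[i:i + chunk_size] for i in range(0, len(all_rows), chunk_size)]
    with multiprocessing.Pool(workers) as pool:
        pool.map(insert_chunk, chunks)
```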
Upvotes: 2
Reputation: 28644
No, the main way to make it much faster is to do all the inserts in one transaction. Using multiple transactions, or no explicit transaction at all, is much slower.
And try to use copy, which is even faster: http://www.postgresql.org/docs/9.1/static/sql-copy.html
If you really have to use inserts, you can also try dropping all indexes on this table, and creating them after loading the data.
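A sketch of that sequence in Python with psycopg2; the table items, the index items_name_idx, and the generated rows are placeholders for illustration:

```python
import psycopg2
from psycopg2.extras import execute_values

# Placeholder data standing in for the real 2 million rows.
rows_to_insert = [(i, "name-%d" % i, 1.0) for i in range(2_000_000)]

conn = psycopg2.connect("dbname=mydb user=myuser")  # hypothetical DSN

with conn, conn.cursor() as cur:
    # Drop the index so it is not maintained for every inserted row.
    cur.execute("DROP INDEX IF EXISTS items_name_idx")

    # All inserts run inside this single transaction, in large batches.
    execute_values(
        cur,
        "INSERT INTO items (id, name, price) VALUES %s",
        rows_to_insert,
        page_size=1000,
    )

    # Rebuild the index once, after the bulk load.
    cur.execute("CREATE INDEX items_name_idx ON items (name)")

conn.close()
```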
This can be interesting as well: http://www.postgresql.org/docs/9.1/static/populate.html
Upvotes: 5