Reputation: 379
We are trying to move our MySQL databases to Amazon Redshift (a data warehouse), and we are running into a problem while updating the warehouse. We enabled the general log on MySQL and are replaying all the logged queries against Redshift. Each UPDATE query takes around 6-7 seconds, so I am looking for a way to execute these updates faster.

Amazon Redshift is internally based on PostgreSQL, so it would be great if anyone who has solved this problem for Redshift or PostgreSQL could suggest a solution, although a general approach to making updates faster would also be helpful.

One solution I have tried is merging all the updates into a set of deletes and inserts: all the updates on a single table are transformed into a single DELETE query with combined WHERE clauses plus a single batch INSERT query, as sketched below. Could anyone either provide an alternative solution or comment on the approach I tried?
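For concreteness, here is a minimal sketch of the transformation I mean, using a hypothetical users(id, name, email) table (all table, column, and value names are made up):

    -- Original replayed statements: one UPDATE per row
    UPDATE users SET email = 'a@example.com' WHERE id = 1;
    UPDATE users SET email = 'b@example.com' WHERE id = 2;
    UPDATE users SET email = 'c@example.com' WHERE id = 3;

    -- Merged form: one DELETE with a combined predicate,
    -- then one multi-row INSERT carrying the updated values
    DELETE FROM users WHERE id IN (1, 2, 3);
    INSERT INTO users (id, name, email) VALUES
        (1, 'Alice', 'a@example.com'),
        (2, 'Bob',   'b@example.com'),
        (3, 'Carol', 'c@example.com');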
Upvotes: 0
Views: 198
Reputation: 8647
Redshift is not intended to be used as a 'regular' row-oriented database: all data should be loaded in bulk using the COPY command (or the CREATE TABLE AS syntax), and the cluster should then be used for SELECT queries.
Operations on single rows (such as inserting or updating one row at a time) are not what this database is optimised for.
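For example, a bulk load from S3 looks roughly like this (the table name, bucket path, and IAM role below are placeholders, and the staged files are assumed to be CSV):

    COPY users
    FROM 's3://my-bucket/users/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
    CSV;

A single COPY over a set of staged files is far faster than replaying the equivalent statements row by row.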
The proposed workaround (DELETE/INSERT instead of UPDATE) will increase speed, but remember that all newly added rows go into the unsorted region of the table; only after a VACUUM will the data be sorted correctly and the space from deleted rows reclaimed.
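For example, after a batch of delete/insert cycles (the table name is a placeholder):

    VACUUM users;   -- re-sorts rows and reclaims space left by deleted rows
    ANALYZE users;  -- refreshes planner statistics after large changes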
Upvotes: 1