Reputation: 151
I am currently using Athena to run my ETL, which produces a CSV file containing the entire data set to be loaded into Aurora RDS tables. I found that LOAD DATA FROM S3 is one option for loading the data. The files are very large, around 10 GB with 4-5 million rows each. Can Aurora handle a load of this size from a single file, or will there be timeouts during the process? If necessary, how can this process be made more efficient?
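For concreteness, here is a rough sketch of the kind of statement being considered, built as a string in Python. The bucket, key, and table names are hypothetical placeholders, and the field and line terminators are assumptions about the CSV layout rather than anything confirmed above. It also assumes Aurora MySQL, since that is the engine that offers LOAD DATA FROM S3.

```python
def build_load_from_s3(s3_uri: str, table: str, skip_header: bool = True) -> str:
    """Return an Aurora MySQL LOAD DATA FROM S3 statement for a CSV file.

    The delimiters below are assumptions about the file; adjust them to match
    the actual export. Inputs are interpolated directly, so only pass trusted
    values.
    """
    clauses = [
        f"LOAD DATA FROM S3 '{s3_uri}'",
        f"INTO TABLE {table}",
        "FIELDS TERMINATED BY ','",
        "LINES TERMINATED BY '\\n'",
    ]
    if skip_header:
        # Skip the header row that a CSV export typically carries.
        clauses.append("IGNORE 1 LINES")
    return "\n".join(clauses) + ";"


# Hypothetical bucket, key, and table names, for illustration only.
statement = build_load_from_s3(
    "s3://example-bucket/athena-output/data.csv", "example_db.example_table"
)
```

The resulting string would be executed through whatever MySQL client is already in use; the cluster also needs an IAM role that allows it to read from the bucket.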
Upvotes: 0
Views: 1285
Reputation: 7669
You should consider using AWS Database Migration Service for this. Once you set up the migration, AWS DMS fully manages the work, and it will take care of any timeouts or failures that it might encounter.
AWS DMS allows you to use many sources (including S3) to load data into many targets (including Aurora).
An AWS DMS migration can run as a one-time task or as an initial load followed by ongoing data replication.
All data changes to the source database that occur during the migration are continuously replicated to the target.
(From AWS DMS Benefits)
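As a minimal sketch of what setting up such a task involves, a DMS task takes a JSON table-mapping document that selects which tables to migrate. The helper below builds one selection rule; the schema and table names are hypothetical, and the rule name is just an illustrative label.

```python
import json


def build_table_mappings(schema: str, table: str) -> str:
    """Build a DMS table-mapping document with one 'include' selection rule."""
    mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-target-table",  # illustrative label
                "object-locator": {
                    "schema-name": schema,
                    "table-name": table,
                },
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(mappings, indent=2)


# Hypothetical schema and table names, for illustration only.
table_mappings = build_table_mappings("example_db", "example_table")
```

For a one-time load, this string would typically be passed as the `TableMappings` argument when creating a replication task with the migration type set to `full-load`, alongside source (S3) and target (Aurora) endpoints that have already been defined. An S3 source endpoint also needs its own description of the file's columns, which is not shown here.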
Upvotes: 1