Reputation: 9499
My goal is to export a large (~300GB) table to a CSV/TSV file in S3 for long-term storage (basically, if someone WANTS to look at it in years to come, they can, but it is not required to be available online). I need to copy JUST THIS ONE TABLE, not the whole DB.
After some quick googling, it seemed like Data Pipeline was the way to go. Sadly, the data pipelines keep failing with "No space left on device",
which seems to be because Data Pipeline doesn't support multipart uploads (https://forums.aws.amazon.com/message.jspa?messageID=730566).
AWS Support suggested we increase the EBS root volume size, which I did by creating a custom AMI, to no avail.
I have also tried setting up my own Linux machine with Sqoop on it, but unfortunately that just hangs after a day or two of running the query.
What's the easiest way to dump just one table from private RDS to S3?
Upvotes: 2
Views: 1234
Reputation: 6539
You should avoid exporting to a local file and then uploading to S3 as two separate steps.
For a large database, I would suggest the command below, which streams the dump straight to S3 without occupying local disk space:
mysqldump -h [db_hostname] -u [db_user] -p[db_passwd] [databasename] | aws s3 cp - s3://[s3_bucketname]/[mysqldump_filename]
e.g.
mysqldump -h "$HOST" -u "$USER" -p"$PASSWORD" "$DB_NAME" --routines --single-transaction | gzip | aws s3 cp - s3://bucket/exportfile.sql.gz
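Since the question is about a single table, note that mysqldump also accepts a table name after the database name, so the same pipe works for just that one table. A minimal sketch (the table name and bucket path are placeholders):
mysqldump -h "$HOST" -u "$USER" -p"$PASSWORD" --single-transaction "$DB_NAME" my_table | gzip | aws s3 cp - s3://bucket/my_table.sql.gz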
Upvotes: 0
Reputation: 10874
I suggest starting a micro EC2 instance with a large EBS volume, say 400 GB. Then write a simple shell script (a couple of lines) to export the table using mysqldump, and use the aws s3 cp
command to copy the dump to S3. Be aware that if the table is being updated in real time, the copy on S3 may miss those changes or be inconsistent unless you dump with an option like --single-transaction.
Start with a smaller table first to make sure the process works; mysqldump does handle really huge tables. A minimal sketch of such a script follows.
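Here is a rough version, assuming hypothetical host, credentials, table, and bucket names (adjust to your environment) and a large EBS volume mounted at /mnt/dump:
#!/bin/bash
# Dump one table from RDS to the local EBS volume, then copy the file to S3.
# HOST, USER, PASSWORD, DB_NAME, TABLE, and the bucket path are placeholders.
mysqldump -h "$HOST" -u "$USER" -p"$PASSWORD" --single-transaction "$DB_NAME" "$TABLE" | gzip > /mnt/dump/"$TABLE".sql.gz
aws s3 cp /mnt/dump/"$TABLE".sql.gz s3://bucket/"$TABLE".sql.gz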
Upvotes: 1