Syncing remote folders from several machines to one AWS instance

Question

I have 3 AWS P instances doing heavy processing, each saving its results to its own /home/user/folder.
I also have a main server with the same folder, where I want to collect the results from those 3 instances.
Each instance works on its own part of the whole task, and their results go into subfolders that do not overlap.

The instances hold 2 TB each, so I would like to fetch results from each instance as soon as they appear.
That way, when an instance finishes its job, I won't spend half a day copying results to the main server.

I think one way of solving this is running something like this on each instance:

*/30 * * * * rsync -a /home/user/folder/ ubuntu@1.1.1.1:/home/user/folder/

Are there any smarter ways of achieving the same result, given that all of the instances are on AWS?
I also thought about (1) detachable storage and (2) storing on S3, but being new to AWS I might overlook some hidden pitfalls in such workflows, especially when it comes to terabytes of data and expensive instances.

How do you collect processed data from remote instances?

Roman Shishkin · Accepted Answer

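I would consider using the rclone tool, which can easily be configured for a shared S3 bucket. Just be aware of the difference between copy and sync mode: copy only adds or updates files at the destination, while sync also deletes destination files that are missing from the source. It can reach a throughput of several gigabits per second, depending on your instance type.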

Link for the project: rclone.org
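
As a rough sketch of what this could look like on each instance: the script below pushes the local results folder to a per-instance prefix in one shared bucket. The remote name `s3results`, the bucket name `my-results-bucket`, and the instance label are placeholders made up for illustration, and the rclone remote is assumed to be configured already (for example via `rclone config`). It uses `copy` rather than `sync`, so nothing is ever deleted from the bucket.

```shell
#!/usr/bin/env bash
# Sketch only: push this instance's results to a shared S3 bucket with rclone.
# Assumed, not from the original post: remote "s3results", bucket
# "my-results-bucket", and one prefix per instance inside that bucket.

SRC_DIR="${SRC_DIR:-/home/user/folder}"
REMOTE="${REMOTE:-s3results}"            # hypothetical rclone remote name
BUCKET="${BUCKET:-my-results-bucket}"    # hypothetical bucket name
RCLONE_BIN="${RCLONE_BIN:-rclone}"

# Build the destination path: <remote>:<bucket>/<instance label>
dest_for() {
  printf '%s:%s/%s' "$REMOTE" "$BUCKET" "$1"
}

# Upload new and changed files; "copy" never deletes at the destination.
push_results() {
  "$RCLONE_BIN" copy "$SRC_DIR" "$(dest_for "$1")" --transfers 8
}

# Run only when an instance label is given, e.g.: ./push_results.sh instance-1
if [ -n "${1:-}" ]; then
  push_results "$1"
fi
```

Saved as, say, `/home/user/push_results.sh` (a hypothetical path), this could replace the rsync line in the crontab of each instance, with a different label per instance:

*/30 * * * * /home/user/push_results.sh instance-1

The main server can then pull everything down with a single `rclone copy` from the bucket into /home/user/folder, or read the results straight from S3.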
