iateadonut

Reputation: 2239

syncing files with aws s3 sync that have a minimum timestamp

I am syncing a directory to an S3 bucket, and I only want it to check for files that were created or updated in the last 24 hours.

With GNU/Linux's rsync, you might do this by piping the output of find -mtime to rsync. Is anything like that possible with aws s3 sync?
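
For reference, the rsync pattern I mean looks roughly like this (the destination host and paths are just placeholders):

    # Select files modified in the last 24 hours and feed the list to rsync
    find . -type f -mtime -1 -print0 | \
        rsync -av --from0 --files-from=- ./ user@backuphost:/backup/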

Edited to show the final goal: I'm running a script that constantly syncs files from a web server to S3. It runs every minute, first checks whether a sync is already running (and exits if one is), then runs the aws s3 sync command. The sync takes about 5 minutes and usually picks up 3-5 new files. This puts a slight load on the system, and I think that if it only checked files from the last 24 hours it would be much faster.
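
Roughly what the current script does (a simplified sketch; the lock file, local path and bucket name are placeholders):

    #!/bin/bash
    # Exit immediately if a previous run is still holding the lock
    exec 9>/var/run/s3-sync.lock
    flock -n 9 || exit 0

    # Full sync of the whole directory (this is the slow part)
    aws s3 sync /var/www/uploads/ s3://my-bucket/uploads/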

Upvotes: 0

Views: 4137

Answers (1)

John Rotenstein

Reputation: 269480

No, the AWS Command-Line Interface (CLI) aws s3 sync command does not have an option to only include files created within a defined time period.

See: aws s3 sync documentation

It sounds like most of the time is being consumed by checking whether each file needs to be updated. Some options:

  • If you don't need all the files locally, you could delete them after some time (48 hours?). That means fewer files will need to be compared. By default, aws s3 sync will not delete destination files that do not match a local file, though this behaviour can be enabled with the --delete flag. (See the first sketch after this list.)
  • You could copy recent files (past 24 hours?) into a different directory and run aws s3 sync from that directory, then clear out those files after a successful sync run. (See the second sketch after this list.)
  • If you have flexibility over the filenames, you could include the date in the filename (e.g. 2018-03-13-foo.txt) and then use the --exclude and --include parameters to only copy files with the desired prefix. (See the third sketch after this list.)
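
A minimal sketch of the first option, assuming the local files live in a single directory (the path is a placeholder):

    # Remove local files last modified more than 48 hours ago (-mtime +1)
    find /var/www/uploads -maxdepth 1 -type f -mtime +1 -delete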
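
A sketch of the second option, using find to stage recent files (paths and bucket name are placeholders):

    SRC=/var/www/uploads
    STAGING=/var/tmp/s3-staging
    mkdir -p "$STAGING"

    # Copy files modified in the last 24 hours into the staging directory
    find "$SRC" -maxdepth 1 -type f -mtime -1 -exec cp -p {} "$STAGING"/ \;

    # Sync only the staging directory, then clear it after a successful run
    if aws s3 sync "$STAGING"/ s3://my-bucket/uploads/; then
        rm -f "$STAGING"/*
    fi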
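
And a sketch of the third option, assuming filenames start with a YYYY-MM-DD date (bucket name is a placeholder):

    # Exclude everything, then re-include only files whose names start with today's date
    aws s3 sync /var/www/uploads/ s3://my-bucket/uploads/ \
        --exclude "*" \
        --include "$(date +%Y-%m-%d)-*"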

Upvotes: 1
