YogeshR

Reputation: 1736

Combining Bash command with AWS CLI copy command

I need to copy some files from a Linux machine to an S3 bucket, but only selected files. I can list the files I want with the Bash command below:

ls -1t /var/lib/pgsql/backups/full/backup_daily/test* | tail -n +8

Now I want to combine this Bash command with the AWS S3 cp command. I searched and found the solution below, but it does not work.

ls -1t /var/lib/pgsql/backups/full/backup_daily/test* | tail -n +8  | aws s3 cp - s3://all-postgresql-backup/dev/

How can I make this work?

Upvotes: 3

Views: 4212

Answers (2)

Rajesh Chamarthi

Reputation: 18818

You might also want to take a look at aws s3 sync and aws s3 cp with the --exclude option.

aws s3 sync . s3://mybucket --exclude "*.jpg"

You could have a simple cron job that runs in the background every few minutes and keeps the directories in sync.

From the documentation for sync: Syncs directories and S3 prefixes. Recursively copies new and updated files from the source directory to the destination. Only creates folders in the destination if they contain one or more files.

https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
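As a rough sketch of the --exclude/--include approach: the filters are applied in order and later ones take precedence, so excluding everything and then including a pattern uploads only the matching names. The function name copy_matching is hypothetical, and the paths in the example call are the ones from the question. As far as I know, these filters match on names only, so they cannot express "everything except the seven newest files" by themselves.

```sh
# Hypothetical helper: upload only files under a directory whose names
# match a pattern, using the CLI's own filters.
copy_matching() {
  src=$1 pattern=$2 dest=$3
  shift 3
  # Exclude everything first, then re-include the pattern; any extra
  # arguments (for example --dryrun) are passed through unchanged.
  aws s3 cp "$src" "$dest" --recursive --exclude '*' --include "$pattern" "$@"
}

# Example call (paths taken from the question); --dryrun only prints
# what would be copied:
# copy_matching /var/lib/pgsql/backups/full/backup_daily 'test*' \
#   s3://all-postgresql-backup/dev/ --dryrun
```

Running it with --dryrun first is a cheap way to check that the pattern selects what you expect.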

Upvotes: 1

Charles Duffy

Reputation: 295619

If you're on a platform with GNU tools (find, sort, tail, sed), and you want to insert all the names in the position where you have the -, doing this reliably (in a manner robust against unexpected filenames) might look like:

find /var/lib/pgsql/backups/full/backup_daily -name 'test*' -type f -printf '%T@ %p\0' |
  sort -znr |
  tail -z -n +8 |
  sed -zEe 's/[^ ]+ //' |
  xargs -0 sh -c 'aws s3 cp "$@" s3://all-postgresql-backup/dev/' _

There's a lot there, so let's take it piece-by-piece:

  • ls does not generate output safe for programmatic use. Thus, we use find instead, with a -printf string that puts a timestamp (in UNIX epoch time, seconds since 1970) before each filename, and terminates each entry with a NUL (a character which, unlike a newline, cannot exist in filenames on UNIX).
  • sort -z is a GNU extension which delimits input and output by NULs; -n specifies numeric sort (since the timestamps are numeric); -r reverses sort order.
  • sed -z is a GNU extension which, again, delimits records by NULs rather than newlines; here, we're stripping the timestamp off the records after sorting them.
  • xargs -0 ... tells xargs to read NUL-delimited records from stdin, and append them to the argument list of ..., splitting into multiple invocations whenever this would go over maximum command-line length.
  • sh -c '..."$@"...' _ runs a shell -- sh -- with a command that includes "$@", which expands to the list of arguments that shell was passed. _ is a placeholder for $0. xargs will place the names produced by the preceding pipeline after the _, becoming $1, $2, etc, such that they're placed on the aws command line in place of the "$@".
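The pieces above can be wrapped in a function, sketched below under a few assumptions. To my knowledge, aws s3 cp accepts a single source per invocation, so this variant loops over "$@" and calls it once per file rather than passing every name to one command. The function name upload_older_backups and its parameters are hypothetical; -maxdepth 1 (to mimic the non-recursive glob in the question) and xargs -r (GNU: do nothing when no names arrive) are additions of mine.

```sh
# Hypothetical helper. Requires GNU find, sort, tail, sed and xargs.
# $1 = directory, $2 = name pattern, $3 = number of newest files to skip,
# $4 = destination S3 prefix
upload_older_backups() {
  # Emit "mtime name" records, NUL-terminated; sort newest first; drop
  # the $3 newest; strip the leading timestamp; then upload what is left.
  find "$1" -maxdepth 1 -name "$2" -type f -printf '%T@ %p\0' |
    sort -znr |
    tail -z -n "+$(($3 + 1))" |
    sed -zEe 's/^[^ ]+ //' |
    xargs -0 -r sh -c '
      dest=$1; shift            # first argument is the destination
      for f in "$@"; do         # remaining arguments are filenames
        aws s3 cp "$f" "$dest"
      done
    ' _ "$4"
}

# Example call (paths taken from the question):
# upload_older_backups /var/lib/pgsql/backups/full/backup_daily 'test*' 7 \
#   s3://all-postgresql-backup/dev/
```

Passing the destination as an argument ahead of the filenames, instead of substituting it into the sh -c string, keeps data out of the code that the inner shell parses.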

References:

  • BashFAQ #3 - How can I sort or compare files based on some metadata attribute (newest / oldest modification time, size, etc)?
  • ParsingLs - Why you shouldn't parse the output of ls
  • UsingFind - See the "Actions In Bulk" section for discussion of safety precautions necessary to use xargs without introducing bugs (which the above code follows, but other suggestions may not).

Upvotes: 3
