Reputation: 1938
I have a use case where I programmatically bring up an EC2 instance, copy an executable file from S3, run it and shut down the instance (done in user-data). I need to get only the last added file from S3.
Is there a way to get the last modified file / object from a S3 bucket using the AWS CLI tool?
Upvotes: 159
Views: 200475
Reputation: 21
Here is my attempt: a small CLI helper you run with Python (the same logic can be implemented in any other language). It automates the bucket-by-bucket check you would otherwise do manually.
import subprocess

# Define your bucket names here
buckets = ['development', 'human-resources', 'marketing']
last_backups = {}

# Errors from individual buckets can be ignored
print("Ignore any errors during the process.")

# Prompt the user to choose between backup sizes and last backup dates
option = input("- Backup sizes - 1\n- Last backup dates - 2\n->> ")
command = ''

# Determine the appropriate shell command based on user input
if option == '1':
    command = 'tail -2'
elif option == '2':
    command = 'head -1'

# Retrieve backup information for each bucket
for bucket in buckets:
    # Execute the AWS CLI command to get backup information
    process = subprocess.Popen(
        f"aws s3 ls s3://{bucket}/ --recursive --human-readable --summarize | {command}",
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE
    )
    # Read the output of the command
    output, _ = process.communicate()
    # Store the output in the dictionary
    last_backups[bucket] = output.decode()

# Print the backup information for each bucket
for bucket, backup_info in last_backups.items():
    # Print the last backup date
    if option == '2':
        print(f" - {bucket}: {backup_info[:20]}\n")
    # Print the backup sizes and object counts
    elif option == '1':
        print(f" - {bucket}: {backup_info}\n")
Upvotes: 1
Reputation: 2605
Update: after a while, here is a slightly more elegant way to do it:
aws s3api list-objects-v2 --bucket "my-awesome-bucket" --query 'sort_by(Contents, &LastModified)[-1].Key' --output=text
Instead of an extra reverse function, we can take the last entry of the sorted list directly with [-1].
The original command also does the job without any external dependencies:
aws s3api list-objects-v2 --bucket "my-awesome-bucket" --query 'reverse(sort_by(Contents, &LastModified))[:1].Key' --output=text
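The JMESPath expression sorts the Contents array by LastModified and takes the last key. As a minimal sketch, the same selection in plain Python, using made-up sample data in place of a real list-objects-v2 response:

```python
# Sketch: replicate sort_by(Contents, &LastModified)[-1].Key in Python,
# on a hypothetical list-objects-v2 Contents array.
contents = [
    {"Key": "an_object.txt", "LastModified": "2015-05-05T15:36:17Z"},
    {"Key": "some/other/object", "LastModified": "2015-06-08T14:14:44Z"},
    {"Key": "yet-another-object.sh", "LastModified": "2015-04-29T12:09:29Z"},
]

# ISO 8601 timestamps sort correctly as strings, so a plain sort works.
latest_key = sorted(contents, key=lambda o: o["LastModified"])[-1]["Key"]
print(latest_key)  # some/other/object
```

Note that list-objects-v2 returns at most 1000 keys per call, so for large buckets the query only sorts one page unless you paginate.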
Upvotes: 78
Reputation: 111
aws s3api list-objects-v2 --bucket "bucket-name" | jq -r '.Contents | max_by(.LastModified) | .Key'
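jq's max_by corresponds to a simple max() call in Python. A sketch over a hypothetical Contents array:

```python
# Sketch: jq's `max_by(.LastModified) | .Key` as a Python max() call,
# over a hypothetical Contents array from list-objects-v2.
contents = [
    {"Key": "a.txt", "LastModified": "2021-01-01T00:00:00Z"},
    {"Key": "b.txt", "LastModified": "2021-06-01T00:00:00Z"},
]

latest = max(contents, key=lambda o: o["LastModified"])
print(latest["Key"])  # b.txt
```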
Upvotes: 11
Reputation: 271
The following is a bash script that downloads the latest file from an S3 bucket. I used the aws s3 sync command instead, so that it would not download the file from S3 if it already exists locally.
--exclude excludes all the files
--include includes all the files matching the pattern
#!/usr/bin/env bash
BUCKET="s3://my-s3-bucket-eu-west-1/list/"
FILE_NAME=`aws s3 ls $BUCKET | sort | tail -n 1 | awk '{print $4}'`
TARGET_FILE_PATH=target/datdump/
TARGET_FILE=${TARGET_FILE_PATH}localData.json.gz
echo $FILE_NAME
echo $TARGET_FILE
aws s3 sync $BUCKET $TARGET_FILE_PATH --exclude "*" --include "*$FILE_NAME*"
cp target/datdump/$FILE_NAME $TARGET_FILE
p.s. Thanks @David Murray
Upvotes: -2
Reputation: 4893
You can list all the objects in the bucket with aws s3 ls $BUCKET --recursive:
$ aws s3 ls $BUCKET --recursive
2015-05-05 15:36:17 4 an_object.txt
2015-06-08 14:14:44 16322599 some/other/object
2015-04-29 12:09:29 32768 yet-another-object.sh
They're sorted alphabetically by key, but that first column is the last modified time. A quick sort will reorder them by date:
$ aws s3 ls $BUCKET --recursive | sort
2015-04-29 12:09:29 32768 yet-another-object.sh
2015-05-05 15:36:17 4 an_object.txt
2015-06-08 14:14:44 16322599 some/other/object
tail -n 1 selects the last row, and awk '{print $4}' extracts the fourth column (the name of the object).
$ aws s3 ls $BUCKET --recursive | sort | tail -n 1 | awk '{print $4}'
some/other/object
Last but not least, drop that into aws s3 cp to download the object:
$ KEY=`aws s3 ls $BUCKET --recursive | sort | tail -n 1 | awk '{print $4}'`
$ aws s3 cp s3://$BUCKET/$KEY ./latest-object
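The sort | tail -n 1 | awk '{print $4}' pipeline can be sketched in Python, here applied to made-up aws s3 ls output:

```python
# Sketch: the `sort | tail -n 1 | awk '{print $4}'` pipeline in Python,
# applied to hypothetical `aws s3 ls --recursive` output.
listing = """\
2015-05-05 15:36:17          4 an_object.txt
2015-06-08 14:14:44   16322599 some/other/object
2015-04-29 12:09:29      32768 yet-another-object.sh"""

# Sorting the lines lexicographically orders them by the leading
# date/time columns; the last line is then the newest object.
last_line = sorted(listing.splitlines())[-1]

# The key is the fourth whitespace-separated field.
key = last_line.split()[3]
print(key)  # some/other/object
```

Like awk '{print $4}', this breaks on keys that contain spaces, since the key is everything after the third field, not just the fourth field itself.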
Upvotes: 332
Reputation: 1363
If this is a freshly uploaded file, you can use Lambda to execute a piece of code on the new S3 object.
If you really need to get the most recent one, you can name your files with the date first, sort by name in descending order, and take the first object.
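A quick sketch of the date-first naming idea, using made-up file names: with an ISO-style date prefix, lexicographic order matches chronological order, so a reverse name sort puts the newest file first.

```python
# Sketch: date-first names make a reverse lexicographic sort
# yield the newest file first.
files = [
    "2024-01-15_report.csv",
    "2024-03-02_report.csv",
    "2023-12-30_report.csv",
]

newest = sorted(files, reverse=True)[0]
print(newest)  # 2024-03-02_report.csv
```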
Upvotes: 0