Reputation: 443
I have a camera that adds new files to my AWS S3 bucket every hour, except when it doesn't. For rapid troubleshooting, I'd like to be able to find (either list or view) the most recent file in the S3 folder, or list all of the files since a particular date/time. FWIW, the file names are made up of UNIX epoch date stamps, so I could look for file names that contain a number bigger than, say, 161315000.
The only solution I have so far is making a listing of all of the files, piped to a text file, which I can then parse. This takes too long...I have tens of thousands of files.
I'd be happy to use AWS CLI, s3cmd, Boto... whatever works.
Upvotes: 3
Views: 10925
Reputation: 269284
Rather than using the filename ("Key"), you could simply use the LastModified date that S3 automatically attaches when an object is created.
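To list the most recent object based on this date, you could use the following (with more than 1,000 objects, the CLI may apply the query to each page of results separately when --output text is used and print one key per page; switching to --output json should avoid that):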
aws s3api list-objects --bucket my-bucket --query 'sort_by(Contents, &LastModified)[-1].Key' --output text
To list objects since a given date (S3 reports LastModified in UTC):
aws s3api list-objects --bucket my-bucket --query "Contents[?LastModified>='2021-01-29'].[Key]" --output text
If you wish to do it via Python, you will need to retrieve a list of ALL objects, then you could parse either the object key or LastModified date.
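As a rough illustration of the Python route, here is a minimal sketch using boto3's `list_objects_v2` paginator. The helper names (`iter_objects`, `latest_object`, `objects_since`) and the bucket name are made up for this example, and it assumes your AWS credentials are already configured. It still has to walk every object, so it won't be faster than the CLI commands above.

```python
from datetime import datetime, timezone
from typing import Iterable, Iterator, List, Optional


def iter_objects(bucket: str, prefix: str = "") -> Iterator[dict]:
    """Yield every object record in the bucket, one page (up to 1,000 keys) at a time."""
    import boto3  # imported here so the helpers below can be used without AWS access

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no objects
        yield from page.get("Contents", [])


def latest_object(objects: Iterable[dict]) -> Optional[dict]:
    """Return the record with the newest LastModified, or None if there are no objects."""
    return max(objects, key=lambda o: o["LastModified"], default=None)


def objects_since(objects: Iterable[dict], cutoff: datetime) -> List[dict]:
    """Return records modified at or after cutoff (cutoff must be timezone-aware)."""
    return [o for o in objects if o["LastModified"] >= cutoff]
```

Something like `latest_object(iter_objects("my-bucket"))["Key"]` would then give the newest key, and `objects_since(iter_objects("my-bucket"), datetime(2021, 1, 29, tzinfo=timezone.utc))` the objects from that date onward.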
Upvotes: 9
Reputation: 8435
That's something you can't do with S3 alone, as S3 isn't a file system, but an object store. As such it's optimized for storing large numbers of objects, not for rapid listing.
If you have control over the format of the object keys, you could prefix them with the current date (like 2021/02/11/161315000). That'd make it easy to find the latest object if you're looking for them only manually for debugging purposes.
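To illustrate the date-prefix idea, a small sketch of how the uploader could build such keys from the epoch timestamp. The function name `dated_key` and the optional suffix are hypothetical, and it assumes the timestamp is in seconds and that you want UTC dates.

```python
from datetime import datetime, timezone


def dated_key(epoch_seconds: int, suffix: str = "") -> str:
    """Build a key of the form YYYY/MM/DD/<epoch><suffix>, using the UTC date."""
    day = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime("%Y/%m/%d")
    return f"{day}/{epoch_seconds}{suffix}"
```

With keys laid out like this, a listing restricted to one day's prefix (for example `--prefix 2021/02/12/` on the CLI) only has to return that day's handful of objects.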
If changing the format of the object keys isn't an option, you have to resort to more complex options.
While S3 Inventory reports do provide a listing of all objects and their last-modified times, they probably won't work for you either: the reports are generated only daily or weekly and might not include recently added objects.
An alternative, which might fit better for your use case, would be to utilize S3 event notifications for newly created objects to trigger an AWS Lambda function. This AWS Lambda function could then store the S3 key of the last modified object somewhere (like logging it to Amazon CloudWatch where you could simply check the latest log records for the most recently created S3 object).
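A minimal sketch of what such a Lambda function might look like in Python, assuming the bucket's event notification for object-created events is already wired to it. The helper name `extract_keys` and the log message format are made up for this example. Anything the function prints ends up in its CloudWatch log group.

```python
import urllib.parse
from typing import List


def extract_keys(event: dict) -> List[str]:
    """Pull the object keys out of an S3 event notification payload."""
    keys = []
    for record in event.get("Records", []):
        raw_key = record["s3"]["object"]["key"]
        # Keys arrive URL-encoded in the event (e.g. spaces as '+'), so decode them
        keys.append(urllib.parse.unquote_plus(raw_key))
    return keys


def lambda_handler(event, context):
    """Log each newly created object's key; Lambda forwards stdout to CloudWatch Logs."""
    keys = extract_keys(event)
    for key in keys:
        print(f"New S3 object: {key}")
    return {"keys": keys}
```

Checking the newest entries in the function's log group would then show the most recently uploaded keys without listing the bucket at all.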
Upvotes: 4