RayChen

Reputation: 1468

How can I use the AWS Java SDK to fetch a filtered object list, as with an s3api query?

I'm developing with the AWS Java SDK. I would like to get a list of objects filtered by a criterion such as last-modified date. The s3api CLI supports this, as shown below:

aws s3api list-objects \
    --bucket "myS3-BucketName" \
    --query 'Contents[?LastModified>=`2018-02-01`].{Key: Key, Size: Size, LastModified: LastModified}' \
    --max-items 10

I can't find a similar solution in the Java SDK. How can I get this to work with the Java SDK?

Upvotes: 4

Views: 3215

Answers (1)

Jacob G.

Reputation: 29720

Using v2 of the AWS SDK for Java, I created the following utility method:

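import java.time.Instant;
import java.util.List;
import java.util.stream.Stream;

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Response;
import software.amazon.awssdk.services.s3.model.S3Object;

/**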
 * Gets S3 objects that reside in a specific bucket and whose keys conform to the 
 * specified prefix using v2 of the AWS Java SDK.
 * <br><br>
 * The objects returned will have a last-modified date between {@code start} and 
 * {@code end}, inclusive.
 * <br><br>
 * Any objects that have been modified outside of the specified date-time range will 
 * not be returned.
 *
 * @param s3Client The v2 AWS S3 client used to make the request to S3.
 * @param bucket   The bucket where the S3 objects are located.
 * @param prefix   The common prefix that the keys of the S3 objects must conform to.
 * @param start    The objects returned will have been modified at or after this instant.
 * @param end      The objects returned will have been modified at or before this instant.
 * @return A {@link Stream} of {@link S3Object} objects.
 */
public static Stream<S3Object> getObjects(S3Client s3Client, String bucket, 
                                          String prefix, Instant start, 
                                          Instant end) {
    // The paginator builds the request itself, so the lambda only configures the builder.
    return s3Client.listObjectsV2Paginator(builder -> builder.bucket(bucket)
                   .prefix(prefix))
            .stream()
            .map(ListObjectsV2Response::contents)
            .flatMap(List::stream)
            .filter(s3Object -> {
                Instant lastModified = s3Object.lastModified();
                return !start.isAfter(lastModified) && !end.isBefore(lastModified);
            });
}
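
The boundary behavior of that filter can be checked without calling S3. The following is a minimal sketch, assuming a hypothetical inRange helper that mirrors the predicate used in getObjects; the helper and the RangeCheck class are illustrative names and not part of the SDK.

```java
import java.time.Instant;

public class RangeCheck {
    // Hypothetical helper mirroring the filter in getObjects:
    // true when start <= lastModified <= end (inclusive at both ends).
    static boolean inRange(Instant lastModified, Instant start, Instant end) {
        return !start.isAfter(lastModified) && !end.isBefore(lastModified);
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2018-02-01T00:00:00Z");
        Instant end = Instant.MAX;

        // An object modified exactly at the start instant is kept.
        System.out.println(inRange(start, start, end));
        // An object modified one second earlier is dropped.
        System.out.println(inRange(start.minusSeconds(1), start, end));
    }
}
```

Because both comparisons are negated, an object whose last-modified time equals either bound passes the filter.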

The following code is logically equivalent to your example:

S3Client s3Client = S3Client.create();
String bucket = "myS3-BucketName";
Instant start = Instant.parse("2018-02-01T00:00:00Z");
Instant end = Instant.MAX;

Stream<S3Object> firstTenObjects =
    getObjects(s3Client, bucket, "", start, end).limit(10);

You can use the following methods to get the data you're looking for from each S3Object in the Stream:

S3Object#key()
S3Object#size()
S3Object#lastModified()
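
As a rough sketch of the Key/Size/LastModified projection from the question, the example below filters, limits, and formats a small in-memory list. It assumes a hypothetical FakeS3Object class standing in for the SDK's S3Object so that it runs without AWS credentials; with the real SDK, the same key(), size(), and lastModified() calls would be made on the elements of the Stream returned by getObjects.

```java
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

public class ProjectionSketch {
    // Hypothetical stand-in for the SDK's S3Object, exposing the same three accessors.
    static final class FakeS3Object {
        private final String key;
        private final long size;
        private final Instant lastModified;

        FakeS3Object(String key, long size, Instant lastModified) {
            this.key = key;
            this.size = size;
            this.lastModified = lastModified;
        }

        String key() { return key; }
        long size() { return size; }
        Instant lastModified() { return lastModified; }
    }

    // Formats one object as "Key, Size, LastModified", similar to the CLI projection.
    static String describe(FakeS3Object o) {
        return o.key() + ", " + o.size() + ", " + o.lastModified();
    }

    public static void main(String[] args) {
        // Sample data invented for illustration only.
        List<FakeS3Object> objects = List.of(
            new FakeS3Object("logs/a.txt", 120L, Instant.parse("2018-01-15T00:00:00Z")),
            new FakeS3Object("logs/b.txt", 340L, Instant.parse("2018-02-03T08:30:00Z")));
        Instant start = Instant.parse("2018-02-01T00:00:00Z");

        List<String> lines = objects.stream()
            .filter(o -> !start.isAfter(o.lastModified())) // keep objects modified at or after start
            .limit(10)                                     // cap the result at ten objects
            .map(ProjectionSketch::describe)
            .collect(Collectors.toList());

        lines.forEach(System.out::println); // prints "logs/b.txt, 340, 2018-02-03T08:30:00Z"
    }
}
```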

Upvotes: 3
