user3313379

Reputation: 489

AWS: get S3 bucket size using the Java API

I have searched Google for an efficient way to get metadata about an S3 bucket, such as its size and the number of files in it. I found this link discussing the problem, but it only covers PHP and the AWS CLI using CloudWatch. Is there a Java API to fetch S3 bucket metadata?

Thanks

Upvotes: 3

Views: 6212

Answers (4)

1vand1ng0

Reputation: 1202

You only need the bucket name. This solution relies on Amazon CloudWatch, since S3 does not directly support getting a bucket's size at the present time (Feb 2024).

With Amazon CloudWatch, getting the stats of an S3 bucket (the BucketSizeBytes metric, with the bucket name and storage type as dimensions) would look something like this (needs some refactoring):

package com.xyz.abc.config;

import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.*;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;

public class AwsCloudWatchMetricClient {
    public static final AmazonCloudWatch cw = AmazonCloudWatchClientBuilder.defaultClient();
    public static final String namespace = "AWS/S3";
    public static final String metricName = "BucketSizeBytes";

    public static double getAwsCloudWatchS3Metric(String bucketName) throws ResourceNotFoundException
    {
        double bucketStorageUsedInBytes = -1;

        final GetMetricDataRequest getMetricDataRequest = new GetMetricDataRequest();
        final MetricDataQuery metricDataQuery = new MetricDataQuery();
        metricDataQuery.setId("s3StorageMetric");
        final MetricStat metricStat = new MetricStat();
        final Metric metric = new Metric();
        metric.setNamespace(namespace);

        final Dimension dimensionStorageType = new Dimension();
        dimensionStorageType.setName("StorageType");
        dimensionStorageType.setValue("StandardStorage");

        final Dimension dimensionBucketName = new Dimension();
        dimensionBucketName.setName("BucketName");
        dimensionBucketName.setValue(bucketName);

        List<Dimension> dimensions = new ArrayList<>();
        dimensions.add(dimensionBucketName);
        dimensions.add(dimensionStorageType);

        metric.setDimensions(dimensions);
        metric.setMetricName(metricName);
        metricStat.setMetric(metric);
        metricStat.setPeriod(86400);
        metricStat.setStat("Maximum");
        metricDataQuery.setMetricStat(metricStat);
        getMetricDataRequest.setMetricDataQueries(Collections.singletonList(metricDataQuery));
        final Date endTime = new Date();
        getMetricDataRequest.setEndTime(endTime);
        // Look back 100 hours so that at least one daily datapoint falls inside the window.
        getMetricDataRequest.setStartTime(new Date(endTime.getTime() - 100L * 3600 * 1000));
        // Oldest first, so the last value of a result is the most recent one.
        getMetricDataRequest.setScanBy("TimestampAscending");

        // GetMetricData is paginated: re-issue the request with the returned token until it is null.
        String nextToken = null;
        do {
            getMetricDataRequest.setNextToken(nextToken);
            final GetMetricDataResult response = cw.getMetricData(getMetricDataRequest);

            for (MetricDataResult metricGathered : response.getMetricDataResults()) {
                System.out.printf("Retrieved metric data is %s%n", metricGathered.getValues());
                System.out.println(metricGathered.getMessages());

                final List<Double> vals = metricGathered.getValues();
                if (!vals.isEmpty()) {
                    bucketStorageUsedInBytes = vals.get(vals.size() - 1);
                    return bucketStorageUsedInBytes;
                }
            }

            nextToken = response.getNextToken();
        } while (nextToken != null);

        // No datapoints on any page: the bucket name is wrong or no metric has been published yet.
        throw new ResourceNotFoundException("Problem acquiring S3 maximum storage values for this bucket.");
    }

}

Upvotes: 0

imxo

Reputation: 11

You can use the MinioAdminClient and its getDataUsageInfo() method to get all the info you need.

It can be added via this link and is configured in the same way as MinioClient, except that the parameters are passed through a builder instead of a constructor.

@Bean
public MinioClient minioClient(
        @Value("${aws.endPoint}") String endPoint,
        @Value("${aws.accessKey}") String accessKey,
        @Value("${aws.secretKey}") String secretKey) throws InvalidPortException, InvalidEndpointException {

    return new MinioClient(endPoint, accessKey, secretKey);
}

@Bean
public MinioAdminClient minioAdminClient(
        @Value("${aws.endPoint}") String endPoint,
        @Value("${aws.accessKey}") String accessKey,
        @Value("${aws.secretKey}") String secretKey){

    return MinioAdminClient
            .builder()
            .endpoint(endPoint)
            .credentials(accessKey, secretKey)
            .build();
}

Link to the file in GitHub Repo: https://github.com/minio/minio-java/blob/master/adminapi/src/main/java/io/minio/admin/MinioAdminClient.java#L593

Upvotes: 0

Chandra Duddukuri

Reputation: 57

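With the AWS SDK for Java 2.x you can page through the bucket listing. This example counts the objects and collects their file extensions: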

    Set<String> fileTypes = new HashSet<>();
    int iCount = 0;
    String continuationToken = null;
    // ListObjectsV2 returns at most 1,000 keys per call; follow the continuation token to read every page.
    // (With ListObjects, nextMarker() is only populated when a delimiter is set, so the loop would stop after one page.)
    do {
        ListObjectsV2Response listObjResp = amazonS3Client.listObjectsV2(
                ListObjectsV2Request.builder().bucket(bucketName).continuationToken(continuationToken).build());
        log.info("listObjResp.isTruncated() : " + listObjResp.isTruncated());
        for (S3Object s3Obj : listObjResp.contents()) {
            String sKey = s3Obj.key();
            String[] sKeyValues = sKey.split("\\.");
            if (sKeyValues.length == 2) {
                fileTypes.add(sKeyValues[1]);
            } else {
                fileTypes.add(NO_FILE_EXT);
            }
            ++iCount;
        }
        // Null once the last page has been read.
        continuationToken = listObjResp.nextContinuationToken();
        log.debug("listObjResp.nextContinuationToken() : " + continuationToken);
    } while (continuationToken != null);

    log.info("iCount of '" + bucketName + "': " + iCount);

Upvotes: 1

Istvan

Reputation: 8572

You can find the extensive documentation of the AWS S3 Java library here:

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/overview-summary.html

To answer your question: getSize() returns the size of a single object in S3, so you can iterate over all of your objects and sum the results to get the size of your bucket.

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/S3ObjectSummary.html#getSize()
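
The iterate-and-sum approach described above could be sketched roughly as follows. To keep the example runnable without AWS credentials, the listing call is hidden behind a hypothetical PageSource interface; ObjectEntry, Page and BucketTotals are illustrative names as well, not SDK types. With the real SDK, each page would come from a list-objects call and each size from S3ObjectSummary.getSize(). The sketch assumes every page carries a continuation token that is null on the last page.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: every type in here is a hypothetical stand-in, not an AWS SDK class. */
class BucketSizeSketch {

    /** Stand-in for one listed object (key plus size in bytes). */
    static final class ObjectEntry {
        final String key;
        final long sizeBytes;
        ObjectEntry(String key, long sizeBytes) {
            this.key = key;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Stand-in for one page of a listing; nextToken is null on the last page. */
    static final class Page {
        final List<ObjectEntry> entries;
        final String nextToken;
        Page(List<ObjectEntry> entries, String nextToken) {
            this.entries = entries;
            this.nextToken = nextToken;
        }
    }

    /** Stand-in for the listing call; the token is null when asking for the first page. */
    interface PageSource {
        Page fetch(String token);
    }

    /** Object count and total size accumulated over all pages. */
    static final class BucketTotals {
        final long objectCount;
        final long totalBytes;
        BucketTotals(long objectCount, long totalBytes) {
            this.objectCount = objectCount;
            this.totalBytes = totalBytes;
        }
    }

    /** Walks every page, adding up the number of objects and their sizes. */
    static BucketTotals sumBucket(PageSource source) {
        long count = 0;
        long bytes = 0;
        String token = null;
        do {
            Page page = source.fetch(token);
            for (ObjectEntry entry : page.entries) {
                count++;
                bytes += entry.sizeBytes;
            }
            token = page.nextToken;
        } while (token != null);
        return new BucketTotals(count, bytes);
    }

    public static void main(String[] args) {
        // Two fake pages standing in for a bucket that holds three objects.
        PageSource fake = token -> token == null
                ? new Page(Arrays.asList(new ObjectEntry("a.txt", 100), new ObjectEntry("b.csv", 250)), "page-2")
                : new Page(Arrays.asList(new ObjectEntry("c", 50)), null);
        BucketTotals totals = sumBucket(fake);
        System.out.println(totals.objectCount + " objects, " + totals.totalBytes + " bytes");
    }
}
```

Because a listing returns at most 1,000 keys per request, this costs one request per 1,000 objects, so it gets slow on large buckets.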

S3 does not support gathering disk usage directly (that is, without iterating through all of the items), but you can use CloudWatch to get that data with a single request.

Example query:

aws cloudwatch get-metric-statistics --namespace AWS/S3 --start-time 2016-01-01T10:00:00 --end-time 2016-02-12T01:00:00 --period 86400 --statistics Average --region us-east-1 --metric-name BucketSizeBytes --dimensions Name=BucketName,Value=www.streambrightdata.com Name=StorageType,Value=StandardStorage

Returns:

{
    "Datapoints": [
        {
            "Timestamp": "2016-02-05T10:00:00Z",
            "Average": 54027423.0,
            "Unit": "Bytes"
        },
        {
            "Timestamp": "2016-02-03T10:00:00Z",
            "Average": 52917504.0,
            "Unit": "Bytes"
        },
        {
            "Timestamp": "2016-02-04T10:00:00Z",
            "Average": 53417421.0,
            "Unit": "Bytes"
        },
        {
            "Timestamp": "2016-02-07T10:00:00Z",
            "Average": 54949563.0,
            "Unit": "Bytes"
        },
        {
            "Timestamp": "2016-02-01T10:00:00Z",
            "Average": 24951965.0,
            "Unit": "Bytes"
        },
        {
            "Timestamp": "2016-02-02T10:00:00Z",
            "Average": 28254636.0,
            "Unit": "Bytes"
        },
        {
            "Timestamp": "2016-02-06T10:00:00Z",
            "Average": 54577328.0,
            "Unit": "Bytes"
        }
    ],
    "Label": "BucketSizeBytes"
}
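
Note that the datapoints in the response above are not sorted by time, so code that wants the current bucket size has to pick the newest one itself. A minimal sketch of that selection step, using a hypothetical Datapoint holder class (not the SDK's model type) and a few sample values copied from the output above:

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Sketch only: Datapoint is a hypothetical holder, not the CloudWatch SDK class. */
class LatestDatapointSketch {

    static final class Datapoint {
        final Instant timestamp;
        final double average;
        Datapoint(String isoTimestamp, double average) {
            this.timestamp = Instant.parse(isoTimestamp);
            this.average = average;
        }
    }

    /** Returns the datapoint with the newest timestamp, or empty if there are none. */
    static Optional<Datapoint> latest(List<Datapoint> datapoints) {
        return datapoints.stream().max(Comparator.comparing((Datapoint d) -> d.timestamp));
    }

    public static void main(String[] args) {
        // Same order as in the response: not chronological.
        List<Datapoint> sample = Arrays.asList(
                new Datapoint("2016-02-05T10:00:00Z", 54027423.0),
                new Datapoint("2016-02-07T10:00:00Z", 54949563.0),
                new Datapoint("2016-02-01T10:00:00Z", 24951965.0));
        latest(sample).ifPresent(d ->
                System.out.printf("%s: %.0f bytes%n", d.timestamp, d.average));
    }
}
```

For the sample above this selects the 2016-02-07 datapoint. An empty Optional corresponds to a query window in which no metric was published.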

AWS Java SDK for CloudWatch:

http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/cloudwatch/AmazonCloudWatchClient.html

Upvotes: 3
