Reputation: 3975
I have customer files uploaded to Amazon S3, and I would like to add a feature to count the size of those files for each customer. Is there a way to "peek" into the file size without downloading them? I know you can view it in the Amazon control panel, but I need to do it programmatically.
Upvotes: 112
Views: 153498
Reputation: 629
AWS Java SDK v2.x version
Use this dependency:
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>s3</artifactId>
<version>[2.27.2, 3.0.0)</version>
</dependency>
You can use the s3Client's headObject API as follows:
public Long fetchObjectContentLength(String key) {
HeadObjectResponse headObjectResponse =
s3Client.headObject(builder -> builder.bucket(bucket).key(key));
return Objects.nonNull(headObjectResponse) ? headObjectResponse.contentLength() : 0L;
}
Upvotes: 0
Reputation: 14621
This is a solution for whoever is using Java and the S3 Java library provided by Amazon.
If you are using com.amazonaws.services.s3.AmazonS3, you can use a GetObjectMetadataRequest, which allows you to query the object length.
The libraries you have to use are:
<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-s3 -->
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-s3</artifactId>
<version>1.11.511</version>
</dependency>
Imports:
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;
And the code you need to get the content length:
GetObjectMetadataRequest metadataRequest = new GetObjectMetadataRequest(bucketName, fileName);
final ObjectMetadata objectMetadata = s3Client.getObjectMetadata(metadataRequest);
long contentLength = objectMetadata.getContentLength();
Before you can execute the code above, you will need to build the S3 client. Here is some example code for that:
AWSCredentials credentials = new BasicAWSCredentials(
accessKey,
secretKey
);
s3Client = AmazonS3ClientBuilder.standard()
.withRegion(clientRegion)
.withCredentials(new AWSStaticCredentialsProvider(credentials))
.build();
Upvotes: 11
Reputation: 11
This is how I did it in Java AWS SDK v2.x
Hope this helps.
Region region = Region.EU_CENTRAL_1;
S3Client s3client = S3Client.builder().region(region).build();
String bucket = "s3-demo";
HeadObjectRequest headObjectRequest = HeadObjectRequest.builder()
.bucket(bucket)
.key(fileName)
.build();
HeadObjectResponse headObjectResponse = s3client.headObject(headObjectRequest);
fileSize = headObjectResponse.contentLength();
Upvotes: 1
Reputation: 2118
If you are looking to do this with a single file, you can use aws s3api head-object
to get the metadata only without downloading the file itself:
$ aws s3api head-object --bucket mybucket --key path/to/myfile.csv --query "ContentLength"
Explanation
s3api head-object retrieves the object metadata in JSON format
--query "ContentLength" filters the JSON response to get the size of the body in bytes
Upvotes: 6
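The same metadata-only lookup that `aws s3api head-object` performs can be sketched in Python. This is a minimal sketch, assuming boto3 is installed and credentials are configured; the helper name `object_size_bytes` is made up for illustration, and the client is passed in rather than created inside the function.

```python
def object_size_bytes(s3_client, bucket, key):
    """Return an object's size in bytes via a HEAD request (the body is not downloaded).

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    response = s3_client.head_object(Bucket=bucket, Key=key)
    # ContentLength is the size of the object body in bytes.
    return response["ContentLength"]
```

With boto3 available, a call would look like `object_size_bytes(boto3.client("s3"), "mybucket", "path/to/myfile.csv")`, using the placeholder bucket and key from the CLI command.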
Reputation: 1171
Golang example, same principle: run a HEAD request against the object in question:
// svc is assumed to be an already initialized S3 client (aws-sdk-go v1).
func returnKeySizeInMB(bucketName string, key string) int {
    output, err := svc.HeadObject(
        &s3.HeadObjectInput{
            Bucket: aws.String(bucketName),
            Key:    aws.String(key),
        })
    if err != nil {
        log.Fatalf("Unable to send head request to item %q, %v", key, err)
    }
    // ContentLength is in bytes; integer division truncates to whole MB.
    return int(*output.ContentLength / 1024 / 1024)
}
Here, the parameter key means the path to the file. For example, if the URI of the file is s3://my-personal-bucket/folder1/subfolder1/myfile.pdf, then the call would look like:
output, err := svc.HeadObject(
&s3.HeadObjectInput{
Bucket: aws.String("my-personal-bucket"),
Key: aws.String("folder1/subfolder1/myfile.pdf"),
})
Upvotes: 0
Reputation: 238797
These days you could also use Amazon S3 Inventory which gives you:
Size – The object size in bytes.
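As a rough illustration of using an inventory report for totals, here is a minimal sketch that sums the Size column of a CSV report. The column layout `"Bucket, Key, Size"` is an assumption for this example (a real report lists its columns in the `fileSchema` field of its manifest), the function name is hypothetical, and fetching and decompressing the report file is left out.

```python
import csv
import io


def total_size_from_inventory(csv_text, file_schema="Bucket, Key, Size"):
    """Sum the Size column (bytes) of an inventory report given as CSV text.

    file_schema is the comma-separated list of column names; the default
    used here is an assumed layout, not necessarily that of your report.
    """
    columns = [name.strip() for name in file_schema.split(",")]
    size_index = columns.index("Size")
    total = 0
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines and rows whose Size field is empty.
        if len(row) > size_index and row[size_index]:
            total += int(row[size_index])
    return total
```

Because the report is generated on a schedule, the totals reflect the last inventory run rather than the live state of the bucket.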
Upvotes: 0
Reputation: 121
If the file is private, we can get its headers through the SDK.
PHP example:
$head = $client->headObject(
[
'Bucket' => $bucket,
'Key' => $key,
]
);
$result = (int) ($head->get('ContentLength') ?? 0);
Upvotes: 1
Reputation: 501
Ruby solution with head_object:
require 'aws-sdk-s3'
s3 = Aws::S3::Client.new(
region: 'us-east-1', #or any other region
access_key_id: AWS_ACCESS_KEY_ID,
secret_access_key: AWS_SECRET_ACCESS_KEY
)
res = s3.head_object(bucket: bucket_name, key: object_key)
file_size = res[:content_length]
Upvotes: 1
Reputation: 11
AWS C++ SDK solution to get the file size:
//! Step 1: create s3 client
Aws::S3::S3Client s3Client(cred, config); //! cred and config are set up elsewhere; other constructors also work.
//! Step 2: Head Object request
Aws::S3::Model::HeadObjectRequest headObj;
headObj.SetBucket(bucket);
headObj.SetKey(key);
//! Step 3: read size from object header metadata
auto object = s3Client.HeadObject(headObj);
if (object.IsSuccess())
{
fileSize = object.GetResultWithOwnership().GetContentLength();
}
else
{
std::cout << "Head Object error: "
<< object.GetError().GetExceptionName() << " - "
<< object.GetError().GetMessage() << std::endl;
}
Note: Do not use GetObject just to get the size; it downloads the object body, whereas HeadObject returns only the metadata.
Upvotes: 1
Reputation: 791
.NET AWS SDK, using ListObjectsRequest, ListObjectsResponse, and S3Object:
AmazonS3Client s3 = new AmazonS3Client();
SpaceUsed(s3, "putBucketNameHere");
static void SpaceUsed(AmazonS3Client s3Client, string bucketName)
{
ListObjectsRequest request = new ListObjectsRequest();
request.BucketName = bucketName;
// A single ListObjects call returns at most 1,000 objects; larger buckets need pagination.
ListObjectsResponse response = s3Client.ListObjects(request);
long totalSize = 0;
foreach (S3Object o in response.S3Objects)
{
totalSize += o.Size;
}
Console.WriteLine("Total Size of bucket " + bucketName + " is " +
Math.Round(totalSize / 1024.0 / 1024.0, 2) + " MB");
}
Upvotes: 5
Reputation: 8603
Node.js example:
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
function sizeOf(key, bucket) {
return s3.headObject({ Key: key, Bucket: bucket })
.promise()
.then(res => res.ContentLength);
}
// A test
sizeOf('ahihi.mp4', 'output').then(size => console.log(size));
Doc is here.
Upvotes: 63
Reputation: 6393
You can simply use the s3 ls command:
aws s3 ls s3://mybucket --recursive --human-readable --summarize
Outputs
2013-09-02 21:37:53 10 Bytes a.txt
2013-09-02 21:37:53 2.9 MiB foo.zip
2013-09-02 21:32:57 23 Bytes foo/bar/.baz/a
2013-09-02 21:32:58 41 Bytes foo/bar/.baz/b
2013-09-02 21:32:57 281 Bytes foo/bar/.baz/c
2013-09-02 21:32:57 73 Bytes foo/bar/.baz/d
2013-09-02 21:32:57 452 Bytes foo/bar/.baz/e
2013-09-02 21:32:57 896 Bytes foo/bar/.baz/hooks/bar
2013-09-02 21:32:57 189 Bytes foo/bar/.baz/hooks/foo
2013-09-02 21:32:57 398 Bytes z.txt
Total Objects: 10
Total Size: 2.9 MiB
Reference: https://docs.aws.amazon.com/cli/latest/reference/s3/ls.html
Upvotes: 68
Reputation: 1073
I do something like this in Python to get the cumulative size of all files under a given prefix:
import boto3
bucket = 'your-bucket-name'
prefix = 'some/s3/prefix/'
s3 = boto3.client('s3')
size = 0
result = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
# 'Contents' is absent when no keys match, so default to an empty list.
size += sum(x['Size'] for x in result.get('Contents', []))
# Keep fetching pages until the listing is no longer truncated.
while result['IsTruncated']:
    result = s3.list_objects_v2(
        Bucket=bucket, Prefix=prefix,
        ContinuationToken=result['NextContinuationToken'])
    size += sum(x['Size'] for x in result.get('Contents', []))
print('Total size in MB: ' + str(size / (1000**2)))
Upvotes: 6
Reputation: 1621
The following Python code prints the size of each of the first 1000 objects (at most) under a prefix in S3:
import boto3
bucket = 'bucket_name'
prefix = 'prefix'
s3 = boto3.client('s3')
contents = s3.list_objects_v2(Bucket=bucket, MaxKeys=1000, Prefix=prefix)['Contents']
for c in contents:
    print('Size (KB):', float(c['Size'])/1000)
Upvotes: 1
Reputation: 3444
There is a better solution:
$info = $s3->getObjectInfo($yourbucketName, $yourfilename);
print $info['size'];
Upvotes: 1
Reputation: 327
Integrate the AWS SDK and you get a pretty straightforward solution:
// ... put this in background thread
List<S3ObjectSummary> s3ObjectSummaries;
s3ObjectSummaries = s3.listObjects(registeredBucket).getObjectSummaries();
for (int i = 0; i < s3ObjectSummaries.size(); i++) {
S3ObjectSummary s3ObjectSummary = s3ObjectSummaries.get(i);
Log.d(TAG, "doInBackground: size " + s3ObjectSummary.getSize());
}
Upvotes: 1
Reputation: 5543
PHP code to check an S3 object's size (or any other object headers). Note the use of stream_context_set_default to make sure that only a HEAD request is sent:
stream_context_set_default(
array(
'http' => array(
'method' => 'HEAD'
)
)
);
$headers = get_headers('http://s3.amazonaws.com/bucketname/filename.jpg', 1);
$headers = array_change_key_case($headers);
$size = trim($headers['content-length'],'"');
Upvotes: 0
Reputation: 1463
Using Michael's advice, my successful code looked like this:
require 'net/http'
require 'uri'
file_url = MyObject.first.file.url
url = URI.parse(file_url)
req = Net::HTTP::Head.new url.path
res = Net::HTTP.start(url.host, url.port) {|http|
http.request(req)
}
file_length = res["content-length"]
Upvotes: 7
Reputation: 6945
You can also do a listing of the contents of the bucket. The metadata in the listing contains the file sizes of all of the objects. This is how it's implemented in the AWS SDK for PHP.
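To turn such a listing into per-customer totals, here is a minimal Python sketch. It assumes that each key starts with a customer identifier followed by a slash (for example `alice/report.pdf`); that layout and the function name `sizes_by_customer` are assumptions for illustration, not something stated in the question.

```python
from collections import defaultdict


def sizes_by_customer(pages):
    """Sum object sizes (bytes) per customer from ListObjectsV2-style pages.

    Each page is a dict that may contain a "Contents" list of
    {"Key": ..., "Size": ...} entries, as returned by a listing call.
    The customer is assumed to be the first path segment of the key.
    """
    totals = defaultdict(int)
    for page in pages:
        # "Contents" is missing from a page that matched no keys.
        for obj in page.get("Contents", []):
            customer = obj["Key"].split("/", 1)[0]
            totals[customer] += obj["Size"]
    return dict(totals)
```

With boto3, the pages could come from `boto3.client("s3").get_paginator("list_objects_v2").paginate(Bucket="your-bucket-name")`, which follows continuation tokens so that buckets with more than 1,000 objects are covered.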
Upvotes: 1
Reputation: 5266
Send an HTTP HEAD request to the object. A HEAD request will retrieve the same HTTP headers as a GET request, but it will not retrieve the body of the object (saving you bandwidth). You can then parse out the Content-Length header value from the HTTP response headers.
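A minimal Python sketch of this approach, using only the standard library, is below. The function names are made up for illustration. An unsigned request like this only works when the object is publicly readable; private objects need a signed request, which the SDK head-object calls take care of.

```python
import urllib.request


def parse_content_length(headers):
    """Return the Content-Length header as an int, or None if it is missing."""
    value = headers.get("Content-Length")
    return int(value) if value is not None else None


def head_content_length(url, timeout=10):
    """Send an HTTP HEAD request to url and return the reported size in bytes."""
    request = urllib.request.Request(url, method="HEAD")
    # HEAD returns the same headers as GET but no response body.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers)
```

A call would look like `head_content_length("https://mybucket.s3.amazonaws.com/path/to/myfile.csv")`, where the bucket and key are placeholders.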
Upvotes: 85