Reputation: 1693
I have a service that uploads images to AWS S3 using a MultipartFile. These images are later served as public files. There is a security concern: the images might contain sensitive EXIF metadata (e.g., geolocation data) that has to be removed before they are made public.
Problem: I need to strip the EXIF metadata from these images without loading the entire file into memory, as some of the images could be quite large.
My current approach:
private S3Service.S3UploadedFile uploadImage(MultipartFile file) {
    try {
        ByteArrayOutputStream originalOut = stripMetadata(file.getInputStream());
        final PipedInputStream in = new PipedInputStream();
        new Thread(() -> {
            try (final PipedOutputStream newOut = new PipedOutputStream(in)) {
                originalOut.writeTo(newOut);
            } catch (IOException e) {
                // logging and exception handling should go here
            }
        }).start();
        S3File processedS3File = S3File.builderOf(in, file.getContentType())
            .isPublic(true)
            .contentLength((long) originalOut.size())
            .build();
        return s3Service.upload(bucketName, processedS3File);
    } catch (IOException | ImageWriteException | ImageReadException e) {
        throw new RuntimeException("ERR");
    }
}

public static ByteArrayOutputStream stripMetadata(InputStream imageInputStream)
        throws IOException, ImageWriteException, ImageReadException {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    ExifRewriter exifRewriter = new ExifRewriter();
    exifRewriter.removeExifMetadata(imageInputStream, outputStream);
    return outputStream;
}
The usage of Piped streams is based on this answer: https://stackoverflow.com/a/23874232/6157949
However, in the stripMetadata method, I'm using the Apache Commons Imaging library to remove EXIF metadata. The problem is that it requires an OutputStream, and I'm currently using a ByteArrayOutputStream, which loads the entire image into memory.
What I Need Help With:
I need guidance on how to tweak this approach so that I can strip the EXIF metadata from the image and upload it to S3 without loading the entire file into memory.
Any help or suggestions would be greatly appreciated!
Upvotes: 3
Views: 267
Reputation: 5075
You should simply pass the PipedOutputStream to the exifRewriter.removeExifMetadata call instead of creating a buffer on the heap.
I rewrote your code (untested) to make clear what I mean:
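private S3Service.S3UploadedFile uploadImage(MultipartFile file) {
    try {
        ExifRewriter exifRewriter = new ExifRewriter();
        final PipedInputStream in = new PipedInputStream();
        // Connect the pipe before starting the writer thread, so the reader
        // can never hit an unconnected pipe.
        final PipedOutputStream out = new PipedOutputStream(in);
        new Thread(() -> {
            // Write the stripped image straight into the pipe instead of a heap buffer.
            try (final PipedOutputStream newOut = out) {
                exifRewriter.removeExifMetadata(file.getInputStream(), newOut);
            } catch (IOException | ImageWriteException | ImageReadException e) {
                // logging and exception handling should go here
            }
        }).start();
        // originalOut no longer exists, and the stripped size is not known before
        // the rewrite finishes, so contentLength(...) is left out here. Whether
        // S3File can be built without it is an assumption about your own class.
        S3File processedS3File = S3File.builderOf(in, file.getContentType())
            .isPublic(true)
            .build();
        return s3Service.upload(bucketName, processedS3File);
    } catch (IOException e) {
        throw new RuntimeException("ERR", e);
    }
}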
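One thing the snippet above glosses over is what happens when the rewrite fails halfway: the writer thread closes the pipe, the reader sees a normal end of stream, and a truncated file may get uploaded. Below is a minimal, library-free sketch of the same piped pattern that records the writer's failure and rethrows it on the calling thread. The names PipedTransform, StreamTransform, StreamConsumer, run, copy and countBytes are all made up for illustration; StreamTransform stands in for something like exifRewriter.removeExifMetadata(in, out), and StreamConsumer stands in for the upload.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.atomic.AtomicReference;

final class PipedTransform {

    /** Hypothetical stand-in for a call such as removeExifMetadata(in, out). */
    @FunctionalInterface
    interface StreamTransform {
        void apply(InputStream in, OutputStream out) throws Exception;
    }

    /** Hypothetical stand-in for the uploader: reads the stream and returns a result. */
    @FunctionalInterface
    interface StreamConsumer<R> {
        R consume(InputStream in) throws IOException;
    }

    // Only this many bytes sit between writer and reader at any time.
    private static final int PIPE_BUFFER_BYTES = 64 * 1024;

    static <R> R run(InputStream source, StreamTransform transform,
                     StreamConsumer<R> consumer) throws IOException {
        AtomicReference<Exception> failure = new AtomicReference<>();
        PipedInputStream pipeIn = new PipedInputStream(PIPE_BUFFER_BYTES);
        // Connect both ends before the writer thread starts.
        PipedOutputStream pipeOut = new PipedOutputStream(pipeIn);

        Thread writer = new Thread(() -> {
            try (OutputStream out = pipeOut) {
                transform.apply(source, out);
            } catch (Exception e) {
                failure.set(e); // remember the error for the calling thread
            }
        }, "pipe-writer");
        writer.start();

        R result;
        try {
            result = consumer.consume(pipeIn);
        } finally {
            pipeIn.close(); // lets a still-blocked writer fail instead of hanging
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting for the writer", e);
        }
        if (failure.get() != null) {
            // The consumer may already have processed a truncated stream.
            throw new IOException("Transform failed; consumed data may be truncated", failure.get());
        }
        return result;
    }

    /** Plain copy, used here in place of a real metadata-stripping transform. */
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    /** Counts the bytes in a stream, used here in place of a real upload. */
    static Long countBytes(InputStream in) throws IOException {
        long total = 0;
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream(new byte[1_000_000]);
        long copied = run(source, PipedTransform::copy, PipedTransform::countBytes);
        System.out.println(copied); // prints 1000000
    }
}
```

In your case the failure is only detected after the upload call has returned, so you would probably want to delete the uploaded object (or upload to a temporary key first) when run throws.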
Upvotes: 0