I want to upload a PDF file from a browser client without exposing any credentials or anything unsavory. Based on this, I thought it could be done, but it doesn't seem to work for me.
The premise is:

- You request a pre-signed URL for an S3 Bucket, based on a set of parameters supplied to a function that is part of the JavaScript AWS SDK.
- You supply this URL to the frontend, which can use it to place a file in the S3 Bucket without needing any credentials or authentication on the frontend.
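
In other words, the whole flow should be just two calls. A minimal sketch (API_GATEWAY_URL and pdfBlob are placeholders for my endpoint and file, not code from the service):

// 1) ask the backend for a pre-signed URL; no credentials in the browser
const { data } = await axios.get(API_GATEWAY_URL); // -> { uploadURL, filename }
// 2) send the file body straight to S3 using that URL
await fetch(data.uploadURL, { method: 'PUT', body: pdfBlob });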
This part is simple and it works for me. I just request a URL from S3 with this little JS nugget:
const s3Params = {
  Bucket: uploadBucket,
  Key: `${fileId}.pdf`,
  ContentType: 'application/pdf',
  Expires: 60,
  ACL: 'public-read',
};

let uploadUrl = s3.getSignedUrl('putObject', s3Params);
This is the part that doesn't work, and I can't figure out why. This chunk of code sends a blob of data to the pre-signed S3 URL using a PUT request:
const result = await fetch(response.data.uploadURL, {
  method: 'put',
  body: blobData,
});
What I've tried and looked at so far:

- Using any POST request results in a 400 Bad Request, so PUT it is.
- Content-Type (in my case it'd be application/pdf, i.e. blobData.type): it matches between the backend and frontend.
- A similar use case. Looking at this one, it appears that no headers need to be supplied in the PUT request, and the signed URL itself is all that is necessary for the file upload (sketched below).
- Something weird that I don't understand: it looks like I may need to pass the length and type of the file to the getSignedUrl call to S3.
- Exposing my Bucket to the public (no bueno).
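
To test the "no headers" theory from that list, the minimal version would be to sign with only Bucket, Key, and Expires (no ACL, no ContentType) and send a bare PUT. A sketch, reusing the names from the snippets above:

// sign over nothing beyond the host and the HTTP verb
const bareUrl = s3.getSignedUrl('putObject', {
  Bucket: uploadBucket,
  Key: `${fileId}.pdf`,
  Expires: 60,
});

// then the browser PUT should need no custom headers at all
await fetch(bareUrl, { method: 'PUT', body: blobData });

Here is what I'm actually running; the frontend upload function first: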
...
uploadFile: async function(e) {
  /* receives file from simple input element -> this.file */
  // get signed URL
  const response = await axios({
    method: 'get',
    url: API_GATEWAY_URL
  });
  console.log('upload file response:', response);

  // decode the data URL in this.file into a binary Blob
  let binary = atob(this.file.split(',')[1]);
  let array = [];
  for (let i = 0; i < binary.length; i++) {
    array.push(binary.charCodeAt(i));
  }
  let blobData = new Blob([new Uint8Array(array)], {type: 'application/pdf'});
  console.log('uploading to:', response.data.uploadURL);
  console.log('blob type sanity check:', blobData.type);

  // PUT the blob to the pre-signed URL
  // (note: the Access-Control-Allow-* entries are normally *response*
  // headers in CORS; here they are being sent as request headers)
  const result = await fetch(response.data.uploadURL, {
    method: 'put',
    headers: {
      'Access-Control-Allow-Methods': '*',
      'Access-Control-Allow-Origin': '*',
      'x-amz-acl': 'public-read',
      'Content-Type': blobData.type
    },
    body: blobData,
  });
  console.log('PUT result:', result);

  // the object's eventual URL is the signed URL minus its query string
  this.uploadUrl = response.data.uploadURL.split('?')[0];
}
'use strict';

const uuidv4 = require('uuid/v4');
const aws = require('aws-sdk');

const s3 = new aws.S3();
const uploadBucket = 'the-chumiest-bucket';
const fileKeyPrefix = 'path/to/where/the/file/should/live/';

const getUploadUrl = async () => {
  const fileId = uuidv4();
  const s3Params = {
    Bucket: uploadBucket,
    Key: `${fileId}.pdf`,
    ContentType: 'application/pdf',
    Expires: 60,
    ACL: 'public-read',
  };
  return new Promise((resolve, reject) => {
    let uploadUrl = s3.getSignedUrl('putObject', s3Params);
    resolve({
      'statusCode': 200,
      'isBase64Encoded': false,
      'headers': {
        'Access-Control-Allow-Origin': '*',
        'Access-Control-Allow-Headers': '*',
        'Access-Control-Allow-Credentials': true,
      },
      'body': JSON.stringify({
        'uploadURL': uploadUrl,
        'filename': `${fileId}.pdf`
      })
    });
  });
};

exports.handler = async (event, context) => {
  console.log('event:', event);
  const result = await getUploadUrl();
  console.log('result:', result);
  return result;
};
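
(Aside: since the handler is already async, the manual Promise wrapper isn't doing much; the SDK's getSignedUrlPromise variant, available in recent aws-sdk v2 releases, expresses the same thing directly. A sketch, not the code I'm actually running:)

// same parameters as above, awaited instead of wrapped in a Promise
const getUploadUrl = async () => {
  const fileId = uuidv4();
  const uploadUrl = await s3.getSignedUrlPromise('putObject', {
    Bucket: uploadBucket,
    Key: `${fileId}.pdf`,
    ContentType: 'application/pdf',
    Expires: 60,
    ACL: 'public-read',
  });
  return {
    statusCode: 200,
    isBase64Encoded: false,
    headers: { 'Access-Control-Allow-Origin': '*' },
    body: JSON.stringify({ uploadURL: uploadUrl, filename: `${fileId}.pdf` }),
  };
};

And here is the serverless config for the service: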
service: ocr-space-service

provider:
  name: aws
  region: ca-central-1
  stage: ${opt:stage, 'dev'}
  timeout: 20

plugins:
  - serverless-plugin-existing-s3
  - serverless-step-functions
  - serverless-pseudo-parameters
  - serverless-plugin-include-dependencies

layers:
  spaceOcrLayer:
    package:
      artifact: spaceOcrLayer.zip
    allowedAccounts:
      - "*"

functions:
  fileReceiver:
    handler: src/node/fileReceiver.handler
    events:
      - http:
          path: /doc-parser/get-url
          method: get
          cors: true
  startStateMachine:
    handler: src/start_state_machine.lambda_handler
    role:
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
    events:
      - existingS3:
          bucket: ingenio-documents
          events:
            - s3:ObjectCreated:*
          rules:
            - prefix:
            - suffix: .pdf
  startOcrSpaceProcess:
    handler: src/start_ocr_space.lambda_handler
    role:
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  parseOcrSpaceOutput:
    handler: src/parse_ocr_space_output.lambda_handler
    role:
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  renamePdf:
    handler: src/rename_pdf.lambda_handler
    role:
    runtime: python3.7
    layers:
      - {Ref: SpaceOcrLayerLambdaLayer}
  parseCorpSearchOutput:
    handler: src/node/pdfParser.handler
    role:
    runtime: nodejs10.x
  saveFileToProcessed:
    handler: src/node/saveFileToProcessed.handler
    role:
    runtime: nodejs10.x

stepFunctions:
  stateMachines:
    ocrSpaceStepFunc:
      name: ocrSpaceStepFunc
      definition:
        StartAt: StartOcrSpaceProcess
        States:
          StartOcrSpaceProcess:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-startOcrSpaceProcess"
            Next: IsDocCorpSearchChoice
            Catch:
              - ErrorEquals: ["HandledError"]
                Next: HandledErrorFallback
          IsDocCorpSearchChoice:
            Type: Choice
            Choices:
              - Variable: $.docIsCorpSearch
                NumericEquals: 1
                Next: ParseCorpSearchOutput
              - Variable: $.docIsCorpSearch
                NumericEquals: 0
                Next: ParseOcrSpaceOutput
          ParseCorpSearchOutput:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-parseCorpSearchOutput"
            Next: SaveFileToProcessed
            Catch:
              - ErrorEquals: ["SqsMessageError"]
                Next: CorpSearchSqsErrorFallback
              - ErrorEquals: ["DownloadFileError"]
                Next: CorpSearchDownloadFileErrorFallback
              - ErrorEquals: ["HandledError"]
                Next: HandledNodeErrorFallback
          SaveFileToProcessed:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-saveFileToProcessed"
            End: true
          ParseOcrSpaceOutput:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-parseOcrSpaceOutput"
            Next: RenamePdf
            Catch:
              - ErrorEquals: ["HandledError"]
                Next: HandledErrorFallback
          RenamePdf:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:#{AWS::StackName}-renamePdf"
            End: true
            Catch:
              - ErrorEquals: ["HandledError"]
                Next: HandledErrorFallback
              - ErrorEquals: ["AccessDeniedException"]
                Next: AccessDeniedFallback
          AccessDeniedFallback:
            Type: Fail
            Cause: "Access was denied for copying an S3 object"
          HandledErrorFallback:
            Type: Fail
            Cause: "HandledError occurred"
          CorpSearchSqsErrorFallback:
            Type: Fail
            Cause: "SQS Message send action resulted in error"
          CorpSearchDownloadFileErrorFallback:
            Type: Fail
            Cause: "Downloading file from S3 resulted in error"
          HandledNodeErrorFallback:
            Type: Fail
            Cause: "HandledError occurred"
The PUT request fails with a 403 Forbidden:

Response {
  type: "cors",
  url: "https://{bucket-name}.s3.{region-id}.amazonaws.com/actionID.pdf?Content-Type=application%2Fpdf&X-Amz-Algorithm=SHA256&X-Amz-Credential=CREDZ-&X-Amz-Date=20190621T192558Z&X-Amz-Expires=900&X-Amz-Security-Token={token}&X-Amz-SignedHeaders=host%3Bx-amz-acl&x-amz-acl=public-read",
  redirected: false,
  status: 403,
  ok: false,
  statusText: "Forbidden",
  headers: Headers {},
  body: (...),
  bodyUsed: false
}
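
One detail worth reading out of that URL: X-Amz-SignedHeaders=host%3Bx-amz-acl decodes to host;x-amz-acl, meaning the signature covers the x-amz-acl header, so the PUT has to send that header with exactly the signed value. A sketch of the request shape this particular URL expects (uploadURL and blobData as above):

// the URL was signed over 'host' and 'x-amz-acl'; those must match,
// while headers outside the signature are not checked against it
await fetch(uploadURL, {
  method: 'PUT',
  headers: { 'x-amz-acl': 'public-read' },
  body: blobData,
});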
I'm thinking the parameters supplied to the getSignedUrl call in the AWS S3 SDK aren't correct, though they follow the structure suggested by AWS' docs (explained here). Aside from that, I'm really lost as to why my request is rejected. I've even tried exposing my Bucket fully to the public and it still didn't work.
After reading this, I tried to structure my PUT request like this:
let authFromGet = response.config.headers.Authorization;

const putHeaders = {
  'Authorization': authFromGet,
  'Content-Type': blobData, // note: this passes the Blob object itself, not blobData.type
  'Expect': '100-continue',
};
...
const result = await fetch(response.data.uploadURL, {
  method: 'put',
  headers: putHeaders,
  body: blobData,
});
This resulted in a 400 Bad Request instead of a 403; different, but still wrong. It's apparent that putting headers on the request that weren't part of the signature is wrong.
Upvotes: 2
Views: 3413
Digging into this, it's because you are trying to upload an object with a public ACL into a bucket that doesn't allow public objects. To fix it, either:

- remove the public ACL statement from the signing parameters, or
- ensure the bucket is set to allow public objects (that is, that nothing in its public-access configuration blocks public ACLs).

Basically, you cannot upload objects with a public ACL into a bucket where some restriction prevents that; you'll get the 403 error you describe. HTH.
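
Concretely, the first option is a small change to the code in the question; a sketch, keeping the question's variable names:

// backend: sign without an ACL so nothing conflicts with the bucket's
// public-access restrictions
const s3Params = {
  Bucket: uploadBucket,
  Key: `${fileId}.pdf`,
  ContentType: 'application/pdf',
  Expires: 60,
  // ACL removed: 'public-read' is what triggered the 403
};
const uploadUrl = s3.getSignedUrl('putObject', s3Params);

// frontend: drop the x-amz-acl header (and the Access-Control-Allow-*
// entries, which are response headers) so the PUT matches what was signed
const result = await fetch(response.data.uploadURL, {
  method: 'PUT',
  headers: { 'Content-Type': blobData.type },
  body: blobData,
});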
Upvotes: 1