Reputation: 1728
I have AWS Lambda set up.
import json

def lambda_handler(event, context):
    return {
        'statusCode': 200,
        'body': json.dumps(event)
    }
I would like to POST a PDF file so that I can operate on it in my Lambda function.
Here is my POST code:
import requests
headers = {
    'X-API-KEY': '1234',
    'Content-type': 'multipart/form-data'
}
files = {
    'document': open('my.pdf', 'rb')
}
r = requests.post(url, files=files, headers=headers)
display(r)
display(r.text)
I am getting the error:
<Response [400]>
'{"message": "Could not parse request body into json: Unexpected character (\\\'-\\\' (code 45)) in numeric value: expected digit (0-9) to follow minus sign, for valid numeric value
How can I POST my PDF so that it is sent correctly and I can access it in Lambda?
Note:
I am successful if I do this:
payload = '{"key1": "val1","key2": 22,"key3": 15,"key4": "val4"}'
r = requests.post(url = URL, data=payload, headers=HEADERS)
It is just the PDF part that I can't get working.
Upvotes: 1
Views: 3636
Reputation: 150
Encode and decode your PDF. There are multiple ways to do this; mine is to create an endpoint using FastAPI. Within the endpoint, I create a temp file for the PDF, whose content is base64-encoded to bytes.
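import base64, io, os, tempfile

# create tempfile (pdf_file is assumed to be the FastAPI UploadFile received by the endpoint)
temp_dir = tempfile.mkdtemp()
temp_file_path = os.path.join(temp_dir, pdf_file.filename)
# encode the uploaded bytes as base64 and write them to the temp file
content = base64.b64encode(pdf_file.file.read())
with open(temp_file_path, "wb") as temp_file:
    temp_file.write(content)
# decode back to the original PDF bytes and load them into an in-memory buffer
content = base64.b64decode(content)
buffer = io.BytesIO()
buffer.write(content)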
I am using the AWS SAM CLI, which gave me a template.yml file. In that file, under BinaryMediaTypes, add the following list:
BinaryMediaTypes:
  - application/pdf
  - multipart/form-data
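As a rough sketch of the Lambda side under this setup: when the request's content type matches an entry in BinaryMediaTypes and the function sits behind the Lambda proxy integration, API Gateway passes the body as a base64 string and sets isBase64Encoded to true in the event. The proxy integration is an assumption here (the answer does not say which integration it uses), and read_body is a hypothetical helper name.

```python
import base64
import io


def read_body(event):
    """Return the request body as a BytesIO, decoding base64 when flagged."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        # Binary payloads arrive base64-encoded from API Gateway.
        raw = base64.b64decode(body)
    else:
        # Text payloads arrive as a plain string.
        raw = body.encode("utf-8")
    return io.BytesIO(raw)


def lambda_handler(event, context):
    buffer = read_body(event)
    size = len(buffer.getvalue())
    # Echo the number of bytes received, just to confirm the upload arrived.
    return {"statusCode": 200, "body": str(size)}
```

If the client sends multipart/form-data, the decoded bytes are still the multipart envelope, so they need to be parsed before the PDF bytes themselves are available.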
Upvotes: 0
Reputation: 31
I found this worked quite well for me:
import requests
file_loc = 'path/to/test.pdf'
data = open(file_loc,'rb').read() #this is a bytes object
r = requests.post(url, data=data)
r.ok  # returns True (it is also a good idea to check r.text)
#one-liner
requests.post(url, data=open(file_loc,'rb').read())
# in the Lambda function
import io, base64
body = event["body"]
attachment = base64.b64decode(body.encode()) #this is a bytes object
buff = io.BytesIO(attachment) #this is now useable - read/write etc.
#one-liner
buff = io.BytesIO(base64.b64decode(event["body"].encode()))
Not quite sure why, but for me base64 encoding (even with urlsafe) in the original request corrupted the file and it was no longer recognised as a PDF in Lambda, so the OP's answer didn't work for me.
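One quick way to tell whether the bytes survived the trip is to look for the %PDF- signature that PDF files start with. This is a minimal sketch: looks_like_pdf and decode_pdf_body are hypothetical names, the 1024-byte search window is a lenient assumption, and the check only inspects the header, so it does not prove the rest of the file is intact.

```python
import base64
import io

PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the PDF signature appears near the start of the data."""
    return PDF_MAGIC in data[:1024]


def decode_pdf_body(body: str) -> io.BytesIO:
    """Base64-decode a request body and fail early if it is not a PDF."""
    raw = base64.b64decode(body)
    if not looks_like_pdf(raw):
        raise ValueError("decoded body does not look like a PDF")
    return io.BytesIO(raw)
```

Calling decode_pdf_body(event["body"]) in the handler would then raise a ValueError on a corrupted upload instead of failing later in a PDF library.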
Upvotes: 1
Reputation: 1728
I figured it out. Took me a ton of time but I think I got it. Essentially it's all about encoding and decoding as bytes. Didn't have to touch the API Gateway at all.
Request:
import base64
import requests

HEADERS = {'X-API-KEY': '12345'}
data = '{"body" : "%s"}' % base64.b64encode(open(path, 'rb').read())
r = requests.post(url, data=data, headers=HEADERS)
In Lambda:
import base64
from io import BytesIO

def lambda_handler(event, context):
    pdf64 = event["body"]
    # The client formatted a bytes object into the JSON string, so the value
    # arrives as "b'...'". Drop the leading b' before decoding.
    pdf64 = pdf64[2:].encode('utf-8')
    buffer = BytesIO()
    content = base64.b64decode(pdf64)
    buffer.write(content)
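The slicing is only needed because b64encode returns bytes, and formatting bytes into a str in Python 3 produces the literal b'...'. A possible variant, sketched here under the assumption that event["body"] reaches the handler the same way as in the answer above, decodes the base64 bytes to text before building the JSON. build_payload and extract_pdf are hypothetical names.

```python
import base64
import json
from io import BytesIO


def build_payload(pdf_bytes: bytes) -> str:
    """Client side: wrap the PDF as a base64 string inside a JSON document."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return json.dumps({"body": encoded})


def extract_pdf(event: dict) -> BytesIO:
    """Lambda side: decode the base64 string back into the original bytes."""
    return BytesIO(base64.b64decode(event["body"]))
```

On the client this would be used as requests.post(url, data=build_payload(open(path, 'rb').read()), headers=HEADERS), and the handler would call extract_pdf(event) with no slicing.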
Upvotes: 3