Reputation: 525
Apologies for repeating a question that has been asked here before; the existing answers did not solve my problem. How can I convert a PDF file stored in S3 into a string variable using a Lambda function?
My Lambda function shows this error:
Unable to import module 'lambda_function': No module named 'pdfminer'
I found the code below in this answer, but I am stuck implementing it in Lambda. If the code is correct, I think the data variable will contain the text of the PDF file from S3. If it is not, please suggest how I should change my code.
import json
import boto3
import botocore
import sys
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage
from pdfminer.converter import XMLConverter, HTMLConverter, TextConverter
from pdfminer.layout import LAParams
import io

s3 = boto3.client('s3')

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']
    filename = 'myfile'
    s3.download_file(bucket, key, '/tmp/' + filename)
    print('reading')
    fp = open('/tmp/' + filename, 'rU').read()
    rsrcmgr = PDFResourceManager()
    retstr = io.StringIO()
    codec = 'utf-8'
    laparams = LAParams()
    device = TextConverter(rsrcmgr, retstr, codec=codec, laparams=laparams)
    # Create a PDF interpreter object.
    interpreter = PDFPageInterpreter(rsrcmgr, device)
    # Process each page contained in the document.
    for page in PDFPage.get_pages(fp):
        interpreter.process_page(page)
    data = retstr.getvalue()
    print(data)
Upvotes: 0
Views: 1527
Reputation: 7356
The problem is that your Lambda function cannot find the pdfminer library, which is not present in the Lambda container. To fix this, install the library into the root of your application (the directory that contains your lambda_function.py file) so that it is included in the deployment package. There are two ways to do this:
pip install pdfminer -t ./
pip install -r requirements.txt -t ./
The second command assumes a requirements.txt file that lists pdfminer. It is recommended to run these commands in a virtual environment.
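Once the library is installed next to the handler, the directory still has to be zipped with its contents at the archive root before it is uploaded. Below is a minimal sketch of that step using only the standard library. The function names build_lambda_zip and has_top_level_package are hypothetical helpers, not part of any AWS tooling, and the sketch assumes the handler file and the pip-installed packages live in the same directory.

```python
import zipfile
from pathlib import Path
from typing import List


def build_lambda_zip(source_dir: str, zip_path: str) -> List[str]:
    """Zip everything under source_dir so that files sit at the archive root.

    Hypothetical helper: returns the archive names that were written.
    """
    root = Path(source_dir).resolve()
    target = Path(zip_path).resolve()
    names = []
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            rel = path.relative_to(root)
            # Skip directories, bytecode caches, and the archive itself
            # (in case it is being written inside source_dir).
            if path.is_dir() or "__pycache__" in rel.parts or path == target:
                continue
            archive.write(path, rel.as_posix())
            names.append(rel.as_posix())
    return names


def has_top_level_package(zip_path: str, package: str) -> bool:
    """Return True if the archive contains package/__init__.py at its root."""
    with zipfile.ZipFile(zip_path) as archive:
        return f"{package}/__init__.py" in archive.namelist()
```

Calling has_top_level_package("function.zip", "pdfminer") before uploading is a quick way to catch the "No module named 'pdfminer'" error locally; the archive name function.zip is only an illustration.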
Upvotes: 2