Reputation: 494
I'm uploading multiple files to Flask through a form. The file objects arrive in the Flask backend without a problem, but I want to read the PDF files to extract text from them, and I can't do that on the file objects I receive from the form. Another method I thought of was saving each file to local storage and then reading it back, but when I did that with file.save(path, filename) it created an empty text file named filename.pdf.
import os
from flask import Flask, request

app = Flask(__name__)

@app.route('/')
def index():
    return '''
    <form method='POST' action='/saveData'>
    <input type='file' name='testReport'>
    <input type='submit'>
    </form>
    '''

@app.route('/saveData', methods=['POST'])
def saveData():
    if 'testReport' in request.files:
        testReport = request.files['testReport']
        # This isn't working: a text file is saved with the same name, ending in .pdf
        testReport.save(os.path.join(app.config['UPLOAD_FOLDER'], testReport.filename))
        return f'<h1>File saved {testReport.filename}</h1>'
    else:
        return 'Not done'
How do we operate on PDF files after uploading them to Flask?
Upvotes: 3
Views: 4970
Reputation: 93
You can follow Flask's own upload pattern, as described [here].
It works fine with PDFs. Just don't forget to include the pdf extension in ALLOWED_EXTENSIONS.
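For illustration, here is a minimal sketch of that upload pattern. It reuses the testReport field name and the /saveData route from the question. The uploads directory and the allowed_file helper are assumptions made for this example, not something the question specifies. Note that the form declares enctype="multipart/form-data", which the form in the question omits; without it the browser does not send the file contents and request.files stays empty.

```python
import os

from flask import Flask, request
from werkzeug.utils import secure_filename

# Assumption: uploads are stored in an "uploads" directory under the working directory.
UPLOAD_FOLDER = os.path.abspath("uploads")
ALLOWED_EXTENSIONS = {"pdf"}

app = Flask(__name__)
app.config["UPLOAD_FOLDER"] = UPLOAD_FOLDER


def allowed_file(filename):
    """Return True if the filename ends in an extension from ALLOWED_EXTENSIONS."""
    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS


@app.route("/")
def index():
    # enctype is required for file uploads; otherwise only the filename is submitted.
    return """
    <form method="POST" action="/saveData" enctype="multipart/form-data">
      <input type="file" name="testReport">
      <input type="submit">
    </form>
    """


@app.route("/saveData", methods=["POST"])
def save_data():
    uploaded = request.files.get("testReport")
    if uploaded is None or uploaded.filename == "":
        return "No file uploaded", 400
    if not allowed_file(uploaded.filename):
        return "Only PDF files are accepted", 400
    # secure_filename strips path separators and other unsafe characters.
    filename = secure_filename(uploaded.filename)
    os.makedirs(app.config["UPLOAD_FOLDER"], exist_ok=True)
    uploaded.save(os.path.join(app.config["UPLOAD_FOLDER"], filename))
    return f"<h1>File saved {filename}</h1>"
```

To accept several files at once, the input would also need the multiple attribute, and the view could loop over request.files.getlist("testReport") instead of reading a single entry.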
Upvotes: 0
Reputation: 36360
"How do we operate on PDF files after uploading them to Flask?"
You should treat them just like any other PDF files. Whether they were uploaded through a Flask application or obtained some other way is irrelevant here. Since you want to read the PDF files to extract text from them, you should use a PDF text-extraction tool, for example pdfminer.six. It is an external module, so you need to install it first: pip install pdfminer.six
Upvotes: 1