Reputation: 11
I recently learned that the PDF files and images I uploaded to my Heroku website were removed whenever I updated the website. Because of this, I have been trying to store my PDFs in my MongoDB database using Mongoengine (with Flask and Python), and then retrieve them and save them to the static folder. I was able to do this successfully with my images, but have had no luck with the PDFs.
Below is the relevant code for my Mongoengine class:
class Article(Document):
    uploaded_content = FileField()  # Field for storing PDF
    uploaded_content_name = StringField()  # File name for PDF
The relevant code for my Flask route that is trying to store the PDF:
data = Article()
if request.files['uploaded-article']:
    data.uploaded_content = request.files['uploaded-article']
# uploaded_content_name given random name below, and stored in
# database
And then here is my code that tries to retrieve the PDF from mongoengine, and save it to my blog folder:
articles = Article.objects()
for art in articles:
    path = os.path.join(app.config['BLOG_FOLDER'], art.uploaded_content_name)
    if not os.path.isfile(path):
        f = open(art.uploaded_content.read(), 'wb')  # This line gives the error
        f.save(os.path.join(app.config['BLOG_FOLDER'] + art.uploaded_content_name), "PDF")
The error occurs on the line where I try to open the PDF file I stored in my database. I have tried many different approaches and have gotten various errors, but one I get is:
No such file or directory: b''
I can confirm that if I call read() on the database object, it is just an empty byte string.
I have also tried changing my Flask route to the code below, which saves the uploaded PDF from Flask's request object to disk and then stores the opened file in the database. However, this gave me the error ValueError: embedded null byte when I tried to open it, although the read() method at least returned a really long byte string.
data = Article()
if request.files['uploaded-article']:
    # store the PDF in the blog folder
    article_pdf = request.files['uploaded-article']
    article_pdf.save(os.path.join(app.config['BLOG_FOLDER'], article_pdf_filename))
    # Open the PDF just stored in the blog folder
    with open(os.path.join(app.config['BLOG_FOLDER'], article_pdf_filename), 'rb') as f:
        # Store the opened PDF in the database
        data.uploaded_content.put(f)
        f.close()
# uploaded_content_name given random name below, and stored in
# database
I also tried opening the PDF through a BytesIO object, but that resulted in the same embedded null byte error as above.
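For readers hitting the same two messages: both are consistent with Python's built-in open() treating its first argument as a file path, not as file content. The sketch below reproduces them with the standard library only. The name try_open_as_path and the stand-in payload are hypothetical and not from the code above, and the exact exception for the empty byte string is what I would expect on a typical Linux/CPython setup.

```python
# Hypothetical stand-in for the bytes returned by FileField.read();
# real PDF data usually contains null bytes as well.
pdf_bytes = b"%PDF-1.4\x00stand-in payload"


def try_open_as_path(payload):
    """Pass raw bytes to open() as if they were a path.

    Returns the name of the exception raised, or None if open() succeeded.
    """
    try:
        handle = open(payload, "wb")
    except (ValueError, OSError) as exc:
        return type(exc).__name__
    handle.close()
    return None


# Bytes containing a null byte are rejected as a path: "embedded null byte".
null_byte_error = try_open_as_path(pdf_bytes)

# An empty byte string is an empty path: "No such file or directory: b''".
empty_error = try_open_as_path(b"")
```

In other words, the bytes returned by read() would need to be written into a file opened at a real path, not passed to open() themselves.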
Are there any suggestions for how I can properly store and retrieve my PDF from my mongoengine database? My apologies for the complexity of my question - however, if needed I can add more details. If there are any alternative ways of storing my PDFs so they do not get lost on Heroku, I would take that as a valid solution as well.
Upvotes: 0
Views: 646
Reputation: 11
For future reference, it looks like this was not working because I did not set the content type when putting the PDF in. My original code for saving the PDF to the data.uploaded_content field was:
data.uploaded_content.put(f)
However, I needed to define the MIME type correctly:
data.uploaded_content.put(f, content_type='application/pdf')
With this change it then worked, and I was able to successfully store the PDF in mongoengine. As far as storing the PDF to a folder after it was successfully uploaded, I used the following code:
if art.uploaded_content_name:
    extension = art.uploaded_content_name.rsplit('.', 1)[1].lower()
    path = os.path.join(app.config['BLOG_FOLDER'], art.uploaded_content_name)
    if not os.path.isfile(path):
        pdf = art.uploaded_content.read()
        with open(os.path.join(app.config['BLOG_FOLDER'], art.uploaded_content_name), 'wb') as f:
            f.write(pdf)
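The restore step above can be sketched as a small standalone helper, which makes it easier to try out without a database. This is only a sketch under assumptions: restore_file is a hypothetical name, io.BytesIO stands in for the Mongoengine file field (anything with a read() method returning bytes), and a temporary directory stands in for app.config['BLOG_FOLDER'].

```python
import io
import os
import tempfile


def restore_file(folder, filename, stream):
    """Write the stream's bytes to folder/filename unless the file already exists.

    Returns True if a file was written, False if it was already present.
    """
    path = os.path.join(folder, filename)
    if os.path.isfile(path):
        return False
    data = stream.read()  # read the content first; the path is opened separately
    with open(path, "wb") as f:
        f.write(data)
    return True


# Demo: a temporary folder stands in for the blog folder (assumption).
with tempfile.TemporaryDirectory() as blog_folder:
    payload = b"%PDF-1.4 stand-in"
    first = restore_file(blog_folder, "example.pdf", io.BytesIO(payload))
    second = restore_file(blog_folder, "example.pdf", io.BytesIO(payload))
    with open(os.path.join(blog_folder, "example.pdf"), "rb") as f:
        restored = f.read()
```

The second call is skipped because the file is already on disk, which mirrors the os.path.isfile check in the loop above.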
Upvotes: 1