Reputation: 276
I am trying to grab an image hosted on a website (e.g. imgur) and add it to a docx.
This is my initial code (it is part of a function; I've stripped it down to the relevant parts):
from PIL import Image
from urllib.request import urlopen
thisParagraph = document.sections[0].paragraphs[0]
run = thisParagraph.add_run()
# imgLink is a direct link to the image. Something like https://i.imgur.com/<name>.jpg
# online is a passed-in boolean to determine if the image link is from an image hosting site
# or from the local machine
if (online):
    imgLinkData = urlopen(imgLink)
    img = Image.open(imgLinkData)
    width, height = img.size
else:
    img = Image.open(imgLink)
    width, height = img.size
    imgLinkData = imgLink
if (width > 250) or (height > 250):
    if (height > width):
        run.add_picture(imgLinkData, width=Cm(3), height=Cm(4))
    else:
        run.add_picture(imgLinkData, width=Cm(4), height=Cm(3))
else:
    run.add_picture(imgLinkData)
For the most part, this works if imgLink points to my local system (i.e. the image is stored on my PC).
But if I refer to a URL (online=True), I get various exceptions (depending on what I tried in my attempts to fix it), ranging from io.UnsupportedOperation (seek) to TypeError (string argument expected, got 'bytes'). The cause is always the run.add_picture line.
The code, as it is now, throws the io.UnsupportedOperation exception.
Upvotes: 1
Views: 1660
Reputation: 276
Think I may have solved the issue.
Based on this link, I made some slight modifications to my code.
I added:
import requests, io
Then I changed:
imgLinkData = urlopen(imgLink)
to
imgLinkData = io.BytesIO(requests.get(imgLink).content)
And this seems to have successfully generated the image in my docx document, though I'm not exactly sure why, aside from the fact that urlopen returned a <class 'http.client.HTTPResponse'>, requests.get returned a <class 'requests.models.Response'>, and .content returned a <class 'bytes'> object.
Further reading even seems to advise against using urllib directly: the urllib.request documentation itself recommends the Requests package as a higher-level HTTP client.
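A likely explanation for the difference: python-docx rewinds and re-reads the stream it is given, so it needs a seekable file-like object. http.client.HTTPResponse is readable but does not implement seek, which would explain the io.UnsupportedOperation (seek) error, while io.BytesIO is fully seekable. The sketch below illustrates this without touching the network. NonSeekableStream and to_seekable are hypothetical names made up for the illustration; they are not part of python-docx, urllib, or requests.

```python
import io


class NonSeekableStream(io.BufferedIOBase):
    """Hypothetical stand-in for a network response: readable, but not seekable.

    Like http.client.HTTPResponse, it inherits the default seek() from the
    io base classes, which raises io.UnsupportedOperation.
    """

    def __init__(self, payload: bytes):
        self._buffer = io.BytesIO(payload)

    def readable(self):
        return True

    def read(self, size=-1):
        return self._buffer.read(size)


def to_seekable(stream):
    """Return the stream unchanged if it can seek, else copy it into memory."""
    if stream.seekable():
        return stream
    # Read everything once and wrap the bytes in a seekable in-memory file.
    return io.BytesIO(stream.read())


response_like = NonSeekableStream(b"fake image bytes")
try:
    response_like.seek(0)
except io.UnsupportedOperation as exc:
    print("seek failed:", exc)

image_stream = to_seekable(response_like)
image_stream.seek(0)  # works on the in-memory copy
```

Wrapping requests.get(imgLink).content in io.BytesIO has the same effect as to_seekable here: the whole body is already in memory as bytes, so the resulting stream can be rewound as often as needed.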
Upvotes: 0
Reputation: 28893
Save the image to a file and then use the file path as the first argument to .add_picture(). This would be something roughly like:
img.save("my-image.jpg")
run.add_picture("my-image.jpg", width=Cm(3), height=Cm(4))
As an alternative, you could create an "in-memory" file (io.BytesIO) containing the image and use that. This second approach has the advantage of not requiring access to a filesystem.
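import io
from urllib.request import urlopen
# io.BytesIO needs the raw bytes, so read the response body first
image_stream = io.BytesIO(urlopen(imgLink).read())
run.add_picture(image_stream, width=Cm(3), height=Cm(4))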
The interface to Document.add_picture() expects a str path or a file-like object (open file or in-memory file) as its first argument: https://python-docx.readthedocs.io/en/latest/api/document.html#docx.document.Document.add_picture
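To tie the in-memory approach back to the question's online/local switch, here is a minimal sketch that loads the image bytes once and hands back a seekable stream for both cases. The function name load_image_stream and its opener parameter are assumptions made for this illustration (opener exists only so the download step can be swapped out); imgLink, online, run and Cm are the names used in the question.

```python
import io
from urllib.request import urlopen


def load_image_stream(source, online, opener=urlopen):
    """Return the image at `source` as a seekable in-memory stream.

    `source` is a URL when `online` is true, otherwise a local file path.
    `opener` is a hypothetical hook that defaults to urllib's urlopen.
    """
    if online:
        # Download the whole body; the response itself cannot be rewound.
        with opener(source) as response:
            data = response.read()
    else:
        with open(source, "rb") as image_file:
            data = image_file.read()
    return io.BytesIO(data)


# Intended use with the names from the question (needs Pillow and python-docx):
#
#   image_stream = load_image_stream(imgLink, online)
#   width, height = Image.open(image_stream).size
#   image_stream.seek(0)  # rewind after Pillow has read from the stream
#   run.add_picture(image_stream, width=Cm(4), height=Cm(3))
```

Reading the bytes once means the same stream can serve both the size check and add_picture, so the image is not downloaded twice. The explicit seek(0) is a cheap safeguard in case the consumer does not rewind the stream itself.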
Upvotes: 1