Erik

Reputation: 276

Python-docx - insert picture into docx from URL

I am trying to grab an image hosted on a website (e.g. Imgur) and add it to a docx file.

This is my initial code (part of a function, stripped down to the relevant lines):

from PIL import Image
from urllib.request import urlopen
from docx.shared import Cm

thisParagraph = document.sections[0].paragraphs[0]
run = thisParagraph.add_run()

# imgLink is a direct link to the image, something like https://i.imgur.com/<name>.jpg
# online is a boolean passed in to say whether imgLink points to an image hosting site
# or to a file on the local machine
if online:
    imgLinkData = urlopen(imgLink)
    img = Image.open(imgLinkData)
    width, height = img.size
else:
    img = Image.open(imgLink)
    width, height = img.size
    imgLinkData = imgLink

# Shrink anything larger than 250 px on either side to a fixed portrait or landscape size
if (width > 250) or (height > 250):
    if height > width:
        run.add_picture(imgLinkData, width=Cm(3), height=Cm(4))
    else:
        run.add_picture(imgLinkData, width=Cm(4), height=Cm(3))
else:
    run.add_picture(imgLinkData)

For the most part, this works if imgLink points to a file on my local system (i.e. the image is stored on my PC).

But if I refer to a URL (online=True), I get various exceptions, depending on which fix I attempt, ranging from io.UnsupportedOperation (seek) to TypeError (string argument expected, got 'bytes'). The cause is always the run.add_picture line.

The code, as it is now, throws the io.UnsupportedOperation exception.

Upvotes: 1

Views: 1660

Answers (2)

Erik

Reputation: 276

I think I may have solved the issue.

Based on this link, I made some slight modifications to my code.

I added:

import requests, io

Then I changed:

imgLinkData = urlopen(imgLink)

to

imgLinkData = io.BytesIO(requests.get(imgLink).content)

This seems to have successfully added the image to my docx document, though I'm not exactly sure why. The only difference I can see is that urlopen returned a

<class 'http.client.HTTPResponse'>

object, whereas requests.get returned a

<class 'requests.models.Response'>

object whose .content attribute is a

<class 'bytes'>

object.

Further reading even seems to advise against using urllib.
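
A likely explanation, offered as a sketch rather than a confirmed account of python-docx internals: add_picture appears to need a stream it can rewind with seek(), and an HTTPResponse cannot be rewound, while an io.BytesIO can. In the question's code, Image.open may also have already read from the response before add_picture received it. The example below runs offline by using a hypothetical NonSeekableReader class as a stand-in for the network response, plus a hypothetical to_seekable helper; neither name comes from python-docx or requests.

```python
import io


class NonSeekableReader(io.RawIOBase):
    """Hypothetical stand-in for a network response: readable, not seekable."""

    def __init__(self, payload: bytes):
        super().__init__()
        self._buffer = io.BytesIO(payload)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return False

    def readinto(self, b) -> int:
        # Copy up to len(b) bytes into the caller's buffer; 0 signals end of data.
        data = self._buffer.read(len(b))
        b[:len(data)] = data
        return len(data)


def to_seekable(source) -> io.BytesIO:
    """Return an in-memory, seekable copy of raw bytes or of a readable stream."""
    if isinstance(source, (bytes, bytearray)):
        return io.BytesIO(bytes(source))
    # Drain the stream once; the BytesIO copy can then be rewound freely.
    return io.BytesIO(source.read())


response_like = NonSeekableReader(b"fake image bytes")
try:
    response_like.seek(0)
except io.UnsupportedOperation:
    print("the raw stream cannot seek")

stream = to_seekable(response_like)
stream.seek(0)  # works: BytesIO supports seek()
print(stream.seekable())
```

With a real download, the bytes passed to to_seekable would be requests.get(imgLink).content or urlopen(imgLink).read(), and the resulting stream would be what goes to run.add_picture.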

Upvotes: 0

scanny

Reputation: 28893

Save the image to a file and then use the file path as the first argument to .add_picture(). This would be something roughly like:

img.save("my-image.jpg")
run.add_picture("my-image.jpg", width=Cm(3), height=Cm(4))

As an alternative, you could create an "in-memory" file (io.BytesIO) containing the image and use that. This second approach has the advantage of not requiring access to a filesystem.

import io
# BytesIO needs the raw bytes, not the HTTPResponse itself, so read the body first
image_stream = io.BytesIO(urlopen(imgLink).read())
run.add_picture(image_stream, width=Cm(3), height=Cm(4))

The interface to Document.add_picture() expects a str path or a file-like object (open file or in-memory file) as its first argument: https://python-docx.readthedocs.io/en/latest/api/document.html#docx.document.Document.add_picture
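
Putting the pieces together, here is a minimal sketch of the in-memory approach combined with the question's sizing rule. The function names picture_size_cm and add_picture_from_bytes are made up for illustration, and the 250 px threshold and the 3 x 4 cm sizes are taken straight from the question. The Pillow and python-docx imports are kept inside the second function only so that the sizing rule can be tried without those packages installed.

```python
import io


def picture_size_cm(width_px, height_px, limit_px=250):
    """Mirror the question's rule: small images keep their native size,
    larger ones get a fixed portrait (3 x 4 cm) or landscape (4 x 3 cm) box."""
    if width_px <= limit_px and height_px <= limit_px:
        return {}
    if height_px > width_px:
        return {"width": 3, "height": 4}
    return {"width": 4, "height": 3}


def add_picture_from_bytes(run, image_bytes):
    """Add an image held in memory to a python-docx run (hypothetical helper)."""
    # Imported here so picture_size_cm stays usable without these packages.
    from PIL import Image
    from docx.shared import Cm

    # Give Pillow and python-docx separate streams so neither one
    # receives a stream the other has already read from.
    width_px, height_px = Image.open(io.BytesIO(image_bytes)).size
    size = {key: Cm(value) for key, value in picture_size_cm(width_px, height_px).items()}
    run.add_picture(io.BytesIO(image_bytes), **size)


print(picture_size_cm(600, 300))
```

For an online image, image_bytes could come from requests.get(imgLink).content; for a local file, from open(imgLink, "rb").read(). Note that the fixed sizes distort images whose aspect ratio is not 3:4 or 4:3; passing only width (or only height) to add_picture should preserve the original proportions.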

Upvotes: 1
