Reputation: 189
I can extract all the images from the shapes of a slide with python-pptx, as shown in the code below. The problem comes when an image is embedded in a placeholder: I don't know how to get the images out of those placeholders, and the documentation isn't clear to me.
Note also that I only want images above a minimum width, hence the "shape.width > 250000" check in the code.
import os
import pptx
from pptx.enum.shapes import MSO_SHAPE_TYPE

ppFileName = "Test.pptx"
directory = os.path.dirname(os.path.abspath(__file__))
# Output folder, e.g. "Images Test", next to this script
imageDirectory = os.path.join(directory, "Images " + ppFileName.replace(".pptx", ""))
os.makedirs(imageDirectory, exist_ok=True)

def saveImage(shape, slideNumber, imageNumber):
    # Write the shape's embedded image to disk using its native extension
    image = shape.image
    imageBytes = image.blob
    imageFileName = f"Slide {slideNumber} Image {imageNumber}.{image.ext}"
    imagePath = os.path.join(imageDirectory, imageFileName)
    with open(imagePath, 'wb') as file:
        file.write(imageBytes)

prs = pptx.Presentation(os.path.join(directory, ppFileName))
slides = prs.slides
slideNumber = 0
for slide in slides:
    imageNumber = 0
    for shape in slide.shapes:
        # shape.width is measured in EMU (914400 per inch)
        if shape.shape_type == MSO_SHAPE_TYPE.PICTURE and shape.width > 250000:
            saveImage(shape, slideNumber, imageNumber)
            # Increment here: rebinding the argument inside saveImage would not change this counter
            imageNumber += 1
        elif shape.shape_type == MSO_SHAPE_TYPE.GROUP and shape.width > 250000:
            for s in shape.shapes:
                # A group can also hold non-picture shapes, which have no .image
                if s.shape_type == MSO_SHAPE_TYPE.PICTURE:
                    saveImage(s, slideNumber, imageNumber)
                    imageNumber += 1
    slideNumber += 1
Upvotes: 0
Views: 319
Reputation: 189
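Alright, I figured it out.
A picture inside a placeholder reports a shape_type of PLACEHOLDER rather than PICTURE, so the check in my loop skipped it. I added these lines inside the "for slide in slides:" loop, at the same level as the loop over slide.shapes:
for shape in slide.placeholders:
    # Only populated picture placeholders have an .image attribute
    if hasattr(shape, "image") and shape.width > 250000:
        saveImage(shape,slideNumber,imageNumber)
        imageNumber += 1  # keep file names unique within the slide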
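For anyone who wants a single pass instead of two loops, here is a minimal sketch of the same hasattr idea as a generator. The names iter_images and MIN_WIDTH_EMU are my own, not part of python-pptx. It assumes that group shapes expose their children through .shapes and that slide.shapes also contains the placeholders. Unlike the code in the question, it applies the width limit to each picture inside a group rather than to the group as a whole. The stand-in objects at the bottom only exist so the sketch runs without a .pptx file.

```python
from types import SimpleNamespace

# Same threshold as in the question; python-pptx widths are in EMU (914400 per inch).
MIN_WIDTH_EMU = 250000


def iter_images(shapes, min_width=MIN_WIDTH_EMU):
    """Yield the .image of every sufficiently wide shape, descending into groups."""
    for shape in shapes:
        # Assumption: only group shapes carry a .shapes collection of children.
        children = getattr(shape, "shapes", None)
        if children is not None:
            yield from iter_images(children, min_width)
            continue
        # Pictures and populated picture placeholders both have .image;
        # text boxes and other shapes do not, so they are skipped.
        if hasattr(shape, "image") and shape.width > min_width:
            yield shape.image


# Hypothetical stand-ins for python-pptx shapes, for demonstration only.
picture = SimpleNamespace(width=300000, image="picture-image")
small = SimpleNamespace(width=100000, image="icon-image")
text_box = SimpleNamespace(width=400000)
placeholder = SimpleNamespace(width=500000, image="placeholder-image")
group = SimpleNamespace(width=900000, shapes=[picture, small])

found = list(iter_images([group, text_box, placeholder]))
print(found)  # ['picture-image', 'placeholder-image']
```

With a real presentation you would loop over enumerate(iter_images(slide.shapes)) and write each image.blob to a file named with image.ext, as saveImage does above.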
Upvotes: 1