I'm trying to extract images from an invoice for an equipment order, but each time I run the code I only get 4 of the 8 or 9 photos on each page. Are some PDFs simply not compatible with some of PyMuPDF's functions?
import os

import fitz  # PyMuPDF

def extract_images(model_nums, file):
    image_num = 0
    doc = fitz.open(file)
    # new directories that will hold images
    all_path = os.path.join(os.getcwd(), "All Files")
    if not os.path.exists(all_path):
        os.mkdir(all_path)
    # sport_id and sport_path are presumably defined elsewhere in the script (not shown here)
    if not os.path.exists(sport_id):
        os.mkdir(sport_path)
    for i in range(doc.page_count):
        print("Page: " + str(i))
        images = doc.get_page_images(i)
        for img in images:
            xref = img[0]  # first item of each entry is the image's xref
            pix = fitz.Pixmap(doc, xref)
            pix.save(f"{all_path}/{model_nums[image_num]}.jpg")
            pix = None  # release the pixmap
            image_num += 1
I've even tried other people's code that just counts the number of images, and I ran into the same issue.
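For reference, here is a minimal sketch of the kind of counting check described above. It compares the entries listed by `doc.get_page_images()` with the images PyMuPDF reports as actually drawn on the page via `page.get_image_info()`. The names `summarize_image_entries` and `report` are made up for illustration, and whether the two counts differ for this particular invoice is something to check, not a confirmed cause.

```python
from collections import Counter


def summarize_image_entries(entries):
    """Summarize tuples shaped like those returned by doc.get_page_images().

    Only the first item of each entry (the image xref) is used.
    """
    xrefs = [entry[0] for entry in entries]
    counts = Counter(xrefs)
    return {
        "entries": len(xrefs),
        "unique_xrefs": len(counts),
        # xrefs listed more than once on the same page
        "repeated_xrefs": sorted(x for x, n in counts.items() if n > 1),
    }


def report(pdf_path):
    """Print per-page image counts for a PDF (hypothetical helper; needs PyMuPDF)."""
    import fitz  # imported here so the helper above works without PyMuPDF

    doc = fitz.open(pdf_path)
    for page in doc:
        listed = doc.get_page_images(page.number, full=True)
        placed = page.get_image_info()  # images shown on the page
        summary = summarize_image_entries(listed)
        print("Page:", page.number, summary, "shown on page:", len(placed))
    doc.close()
```

Calling `report("invoice.pdf")` (the file name is a placeholder) prints one line per page. If the "shown on page" number is higher than the number of listed entries, some pictures may be stored in a way `get_page_images` does not list; if both numbers are 4, the remaining "photos" may not be stored as separate images at all.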