Reputation: 9
I am trying to create a Python script that will iterate over every page of a PDF and remove the watermark. Some PDF files are 500+ pages, and the watermark currently has to be removed manually from every page before we send the files out to our clients. One issue that I am running into is that on some pages the watermark is a textbox object, while on others it is an image object. There is no way around that; that's just how our system prints these preview files.
I have tried writing a script using PyMuPDF that gets the pixel coordinates of the watermark and removes the item with those exact dimensions. It sort of worked; however, not all of the watermarks are the same (image vs. text), so the dimensions were different. Also, I only want to remove the watermark itself and nothing underneath it. If anyone has an idea on how I can move forward, I'd really appreciate it!
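A minimal sketch of the coordinate-matching approach described above, extended to handle the two watermark types separately. It is an illustration under stated assumptions, not the original script: the names `rects_match` and `remove_watermark`, the two reference boxes, and the 2-point tolerance are all hypothetical, and it assumes a PyMuPDF version that provides `Page.delete_image` (1.21 or later). Note that redaction removes any text overlapping the rectangle, so text sitting under a text watermark may be removed as well.

```python
def rects_match(a, b, tol=2.0):
    """Return True if two (x0, y0, x1, y1) boxes agree within `tol` points on every edge."""
    return all(abs(p - q) <= tol for p, q in zip(a, b))


def remove_watermark(src, dst, text_box, image_box, tol=2.0):
    """Hypothetical helper: strip watermark items whose bbox matches a reference box.

    `text_box` and `image_box` are assumed reference rectangles (in PDF points)
    measured once from a sample page for each watermark type.
    """
    import fitz  # PyMuPDF, imported lazily so rects_match stays dependency-free

    doc = fitz.open(src)
    for page in doc:
        # Image watermarks: compare each placed image's bbox to the reference.
        for info in page.get_image_info(xrefs=True):
            if info["xref"] > 0 and rects_match(info["bbox"], image_box, tol):
                # Replaces the image with a tiny transparent one (affects
                # every page that reuses the same xref).
                page.delete_image(info["xref"])

        # Text watermarks: block tuples are (x0, y0, x1, y1, text, no, type).
        for block in page.get_text("blocks"):
            if block[6] == 0 and rects_match(block[:4], text_box, tol):
                page.add_redact_annot(fitz.Rect(block[:4]))

        # Apply text redactions while leaving overlapping images untouched.
        page.apply_redactions(images=fitz.PDF_REDACT_IMAGE_NONE)

    doc.save(dst, garbage=3, deflate=True)
    doc.close()
```

Something like `remove_watermark("preview.pdf", "clean.pdf", text_box=(...), image_box=(...))` would then process the whole file in one pass. Matching with a tolerance rather than on exact dimensions is meant to absorb small per-page differences; it does not by itself solve the problem of preserving content underneath a text watermark.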
Upvotes: 0
Views: 90