Reputation: 105
I'm trying to remove the watermark from a PDF with Python. I am using PyMuPDF (imported as fitz), and this is the code I am running:
def remove_img_on_pdf(idoc, page):
    #image list
    img_list = idoc.getPageImageList(page)
    con_list = idoc[page]._getContents()
    # xref 274 is the only /Contents object of the page (could be
    for i in con_list:
        c = idoc._getXrefStream(i) # read the stream source
        #print(c)
        if c != None:
            for v in img_list:
                arr = bytes(v[7], 'utf-8')
                r = c.find(arr) # try find the image display command
                if r != -1:
                    cnew = c.replace(arr, b"")
                    idoc._updateStream(i, cnew)
                    c = idoc._getXrefStream(i)
    return idoc

doc = fitz.open('ELN_Mod3AzDOCUMENTS.PDF')
rdoc = remove_img_on_pdf(doc, 0) #first page
rdoc.save('no_img_example.PDF')
I get this error:
Traceback (most recent call last):
  File "watermark.py", line 27, in <module>
    rdoc = remove_img_on_pdf(doc, 0) #first page
  File "watermark.py", line 5, in remove_img_on_pdf
    con_list = idoc[page]._getContents()
AttributeError: 'Page' object has no attribute '_getContents'
Please help me find a solution to this. Thank you in advance.
Upvotes: 0
Views: 17349
Reputation: 1
If you print dir(idoc[0]), you will see the list of attributes the page object actually has. You should use idoc[page].get_contents() instead.
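Since dir() returns a long list, it can help to filter it by a keyword. Here is a minimal sketch of that idea; the helper name `find_attrs` and the `FakePage` class are made up for illustration, and with PyMuPDF you would pass `idoc[page]` instead of the fake object:

```python
def find_attrs(obj, keyword):
    """Return the attribute names of obj that contain keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [name for name in dir(obj) if keyword in name.lower()]


class FakePage:
    """Hypothetical stand-in for a PyMuPDF Page, used only for this demo."""

    def get_contents(self):
        return []

    def get_text(self):
        return ""


# Look for anything related to "contents" on the object
print(find_attrs(FakePage(), "contents"))
```

If the old name is missing, a search like this usually shows what it was renamed to.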
Upvotes: 0
Reputation: 36
Your function uses some strange methods such as _getContents, _getXrefStream and _updateStream. Maybe they are deprecated or something, but here is working code that solves your problem:
import fitz  # PyMuPDF


def remove_img_on_pdf(idoc, page):
    # One entry per image used on the page; item 7 is the image name
    img_list = idoc.get_page_images(page)
    # xrefs of the page's /Contents streams
    con_list = idoc[page].get_contents()
    for i in con_list:
        c = idoc.xref_stream(i)  # read the stream source
        if c is not None:
            for v in img_list:
                arr = bytes(v[7], 'utf-8')
                r = c.find(arr)  # try to find the image display command
                if r != -1:
                    cnew = c.replace(arr, b"")
                    idoc.update_stream(i, cnew)
                    c = idoc.xref_stream(i)  # re-read the updated stream
    return idoc


doc = fitz.open('ELN_Mod3AzDOCUMENTS.PDF')
rdoc = remove_img_on_pdf(doc, 0)  # first page
rdoc.save('no_img_example.PDF')
As you can see, I've replaced the non-working methods with other ones that do work. Also, see the PyMuPDF documentation.
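One caveat with the replace in that function: it removes only the image name, so a call like `/Im1 Do` is left behind as `/ Do`, and a name such as `Im1` would also match inside `Im10`. Below is a minimal sketch that strips the whole `/Name Do` operator instead, working on plain bytes. The helper name `strip_image_calls` and the sample stream are made up for illustration, and I'm assuming the image name is followed by whitespace and then `Do`:

```python
import re


def strip_image_calls(stream, names):
    """Remove every '/<name> Do' operator for the given image names."""
    for name in names:
        # Require whitespace right after the name so "Im1" does not match "Im10"
        pattern = rb"/" + re.escape(name.encode("utf-8")) + rb"\s+Do\b"
        stream = re.sub(pattern, b"", stream)
    return stream


# Made-up content stream with two image invocations
sample = b"q 100 0 0 100 0 0 cm /Im1 Do Q q /Im10 Do Q"
cleaned = strip_image_calls(sample, ["Im1"])
print(cleaned)
```

With PyMuPDF you would pass `idoc.xref_stream(i)` as `stream` and `[v[7] for v in img_list]` as `names`, then write the result back with `idoc.update_stream(i, ...)`.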
Upvotes: 1