Reputation:
I'm trying to add text to a PDF by opening the PDF, adding a text box, and saving it. When I run the code, nothing happens. On the desktop, the file shows as updated, but no text is displayed in it.
Here's the code:
import fitz
doc = fitz.open("/Users/khaylablack/Desktop/participant_certificate.pdf")
page = doc[0] # choose some page
rect = fitz.Rect(50, 100, 200, 200) # rectangle (left, top, right, bottom) in pixels
text = "absolutely not"
rc = page.insertTextbox(rect, text, fontsize = 48, # choose fontsize (float)
fontname = "Times-Roman", # a PDF standard font
fontfile = None, # could be a file on your system
align = 1) # 0 = left, 1 = center, 2 = right
#doc.saveIncr() # update file. Save to new instead by doc.save("new.pdf",...)
doc.save("/Users/khaylablack/Desktop/watermarked_participant_cert.pdf")
Upvotes: 3
Views: 15348
Reputation: 545
I had the same problem and, thanks to @JorjMcKie, I was able to solve it by adding the call
page.clean_contents(False)
right after loading the page.
From the pyMuPDF official documentation:
Page.clean_contents(sanitize=True)
- Changed in v1.17.6
PDF only: Clean and concatenate all contents objects associated with this page. “Cleaning” includes syntactical corrections, standardizations and “pretty printing” of the contents stream. Discrepancies between contents and resources objects will also be corrected if sanitize is true. See Page.get_contents() for more details.
Changed in version 1.16.0: Annotations are no longer implicitly cleaned by this method. Use Annot.clean_contents() separately.
Parameters: sanitize (bool) – (new in v1.17.6) if true, synchronization between resources and their actual use in the contents object is synchronized. For example, if a font is not actually used for any text of the page, then it will be deleted from the /Resources/Font object.
Warning:
This is a complex function which may generate large amounts of new data and render old data unused. It is not recommended using it together with the incremental save option. Also note that the resulting singleton new /Contents object is uncompressed. So you should save to a new file using options “deflate=True, garbage=3”.
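Putting the fix and the warning together, the whole flow could look roughly like the sketch below. The function name add_text_to_pdf and its parameters are made up for illustration, and it assumes a PyMuPDF release that has the snake_case method names (older releases spell the insert call insertTextbox, as in the question).

```python
def add_text_to_pdf(src_path, dst_path, text, rect, fontsize=24):
    """Insert text into the first page of src_path and save to dst_path.

    Hypothetical helper. Returns the value reported by insert_textbox
    (a negative value means the text did not fit).
    """
    import fitz  # PyMuPDF; imported here so the sketch loads without it

    doc = fitz.open(src_path)
    page = doc[0]
    # Clean the existing contents stream before adding anything,
    # as suggested in this answer.
    page.clean_contents(False)
    rc = page.insert_textbox(fitz.Rect(*rect), text,
                             fontsize=fontsize,
                             fontname="Times-Roman",
                             align=1)
    # Save to a new file rather than incrementally; the cleaned contents
    # object is uncompressed, so use the options the docs recommend.
    doc.save(dst_path, deflate=True, garbage=3)
    doc.close()
    return rc
```

You would call it as add_text_to_pdf("in.pdf", "out.pdf", "absolutely not", (50, 100, 400, 200)), where both file names are placeholders.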
Upvotes: 0
Reputation: 61
When using the insertTextbox() method, you have to make sure that the rect you create can contain the text, because if it can't, no text is displayed at all. One way to do this is to check the text's length, like this:
import fitz
text = "absolutely not"
text_length = fitz.getTextlength(text, fontname="Times-Roman", fontsize=48)
text_length will be 270.67.
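If you want to see where 270.67 comes from, here is a rough sketch that redoes the arithmetic without PyMuPDF. The glyph widths are an assumption on my part, taken from the standard Times-Roman font metrics (in 1/1000 of the font size), and the table only covers the characters in this string.

```python
# Advance widths in 1/1000 of the font size, assumed from the standard
# Times-Roman metrics. Only the characters needed here are listed.
TIMES_ROMAN_WIDTHS = {
    " ": 250, "a": 444, "b": 500, "e": 444, "l": 278, "n": 500,
    "o": 500, "s": 389, "t": 278, "u": 500, "y": 500,
}


def text_width(text, fontsize, widths=TIMES_ROMAN_WIDTHS):
    """Return the width of text in points at the given font size."""
    units = sum(widths[ch] for ch in text)
    return units * fontsize / 1000


width = text_width("absolutely not", 48)
rect_width = 200 - 50  # right minus left of the rect in the question

print(round(width, 2))     # 270.67
print(width > rect_width)  # True: the text is wider than the rect
```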
But the width of your rect is just 150 (200 - 50). Also, changing only the rect's width won't work if it is still too short, so you also need to adjust its height. Another thing you can do is simply reduce the font size you're using. These two alternatives look like this:
Alternative 1, larger rect:
fontsize_to_use = 48
text = "absolutely not"
fontname_to_use = "Times-Roman"
text_length = fitz.getTextlength(text,
                                 fontname=fontname_to_use,
                                 fontsize=fontsize_to_use)
rect_x1 = 50
rect_y1 = 100
rect_x2 = rect_x1 + text_length + 2  # needs margin
rect_y2 = rect_y1 + fontsize_to_use + 2  # needs margin
rect = (rect_x1, rect_y1, rect_x2, rect_y2)
## Uncomment if you wish to display rect
# page.drawRect(rect,color=(.25,1,0.25))
rc = page.insertTextbox(rect, text,
                        fontsize=fontsize_to_use,
                        fontname=fontname_to_use,
                        align=1)
Alternative 2, smaller fontsize:
text = "absolutely not"
fontname_to_use = "Times-Roman"
rect_x1 = 50
rect_y1 = 100
rect_x2 = 200
rect_y2 = 200
rect_width = rect_x2 - rect_x1
rect_height = rect_y2 - rect_y1
rect = (rect_x1, rect_y1, rect_x2, rect_y2)
# * 2 because a font size equal to the per-character width would be too small.
# This keeps a good ratio to the rect's width for longer text, but the result
# is not guaranteed to fit.
fontsize_to_use = rect_width / len(text) * 2
## Uncomment if you wish to display rect
# page.drawRect(rect,color=(.25,1,0.25))
rc = page.insertTextbox(rect, text,
fontsize=fontsize_to_use,
fontname=fontname_to_use,
align=1)
Note: rc is the unused rectangle height. It can also be negative; in your case it was -5.59, which means the text exceeded the rect's height.
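Since a negative rc tells you the text did not fit, you can also let the code search for a size that works. Below is a minimal sketch of that idea. The helper insert_with_shrinking_font and the stand-in fake_insert are hypothetical names of mine; fake_insert is only a crude model so the snippet runs without a PDF, and it is not how PyMuPDF lays out text.

```python
def insert_with_shrinking_font(try_insert, start_size, min_size=4.0, step=1.0):
    """Call try_insert(fontsize) with decreasing sizes until it succeeds.

    try_insert must return a number that is negative when the text did not
    fit, like the rc described above. Returns (fontsize, rc), or None if no
    size down to min_size worked.
    """
    size = float(start_size)
    while size >= min_size:
        rc = try_insert(size)
        if rc >= 0:
            return size, rc
        size -= step
    return None


def fake_insert(fontsize, rect_height=100.0, lines=2):
    """Stand-in for the real insert call (assumption, not PyMuPDF's layout):
    pretends the text needs `lines` lines of 1.2 * fontsize each and
    returns the leftover height."""
    return rect_height - lines * fontsize * 1.2


result = insert_with_shrinking_font(fake_insert, start_size=48)
print(result[0])  # 41.0 under the stand-in model
```

With a real page you would pass something like lambda fs: page.insertTextbox(rect, text, fontsize=fs, fontname="Times-Roman", align=1) instead of fake_insert.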
Upvotes: 6
Reputation: 13
Have you installed both pymupdf and fitz? If not, try running these in your command prompt:
pip install pymupdf
pip install fitz
Upvotes: -6