I'm trying to extract the paragraphs from a Word document and write each one into an image with a border, with the text centered in the image. I'd also like to save the image after each iteration under a different name, preferably something like "patentDayMonthYearTime.png", so that the files sort in chronological order.
import time
from PIL import Image
from PIL import ImageFont
from PIL import ImageDraw
from docx import Document

doc = Document('test.docx')
fullText = []
for para in doc.paragraphs:
    W, H = 300, 300
    body = Image.new('RGB', (W, H), (255, 255, 255))
    border = Image.new('RGB', (W+2, H+2), (0, 0, 0))
    border.save('border.png')
    body.save('body.png')
    patent = Image.open('border.png')
    patent.paste(body, (1, 1))
    draw = ImageDraw.Draw(patent)
    font = ImageFont.load_default()
    text = para.text.encode('utf-8')
    ch, pad = 60, 20
    for line in text:
        w, h = draw.textsize(line, font=font)
        draw.text(((W-w)/2, ch), line, (0, 0, 0), font=font)
        ch += h + pad
    date = time.strftime("%Y-%m-%d-%H:%M")
    patent.save('patent.png')
Above is my current code, and with it, I'm receiving the following error:
Traceback (most recent call last):
  File "C:/Users/crazy/PycharmProjects/Patent/patent.py", line 28, in <module>
    w, h = draw.textsize(line, font=font)
  File "C:\Users\crazy\PycharmProjects\Patent\venv\lib\site-packages\PIL\ImageDraw.py", line 423, in textsize
    if self._multiline_check(text):
  File "C:\Users\crazy\PycharmProjects\Patent\venv\lib\site-packages\PIL\ImageDraw.py", line 258, in _multiline_check
    return split_character in text
TypeError: argument of type 'int' is not iterable
Here is the document I'm currently working with:
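You're iterating over the individual characters of each paragraph, not over lines of text. Because `para.text.encode('utf-8')` returns `bytes`, each item in that loop is an `int` in Python 3, which is why `textsize` raises the `TypeError`. I've fixed that by wrapping the paragraph text into lines, and I've also added the date and time to the output filename. Here is the code: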
import time
import textwrap
from docx import Document
from PIL import Image, ImageFont, ImageDraw

W, H = 300, 300
doc = Document('file.docx')
for i, p in enumerate(doc.paragraphs):
    # White body pasted onto a slightly larger black image gives a 1 px border
    body = Image.new('RGB', (W, H), (255, 255, 255))
    patent = Image.new('RGB', (W + 2, H + 2), (0, 0, 0))
    patent.paste(body, (1, 1))
    draw = ImageDraw.Draw(patent)
    font = ImageFont.load_default()
    current_h, pad = 60, 20
    # Wrap the paragraph (a str, not bytes) into lines of at most 40 characters
    for line in textwrap.wrap(p.text, width=40):
        w, h = draw.textsize(line, font=font)
        # Center each line horizontally and move down for the next one
        draw.text(((W - w) / 2, current_h), line, (0, 0, 0), font=font)
        current_h += h + pad
    # One file per paragraph: index plus a timestamp
    patent.save(f'patent_{i+1}_{time.strftime("%Y%m%d%H%M%S")}.png')
Output files:
patent_1_20201211092732.png
patent_2_20201211092732.png
patent_3_20201211092732.png
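If the files should sort chronologically by name, as the question asks, it may help to put the timestamp first, in year-month-day order, and zero-pad the paragraph index. The sketch below is one way to do that; `patent_filename` is a hypothetical helper. It also avoids `:` in the name, since the `%H:%M` format from the question is not valid in Windows file names.

```python
from datetime import datetime


def patent_filename(index, when=None):
    """Hypothetical helper: build a name like 'patent_20201211-092732_001.png'."""
    when = when or datetime.now()
    # Year-first timestamps sort chronologically as plain strings.
    stamp = when.strftime("%Y%m%d-%H%M%S")
    # Zero-padding keeps paragraph 10 after paragraph 2 when sorted by name.
    return f"patent_{stamp}_{index:03d}.png"


fixed = datetime(2020, 12, 11, 9, 27, 32)
names = [patent_filename(i, fixed) for i in (1, 2, 10)]
print(names)
# ['patent_20201211-092732_001.png', 'patent_20201211-092732_002.png', 'patent_20201211-092732_010.png']
```

A day-first pattern such as DayMonthYear would not sort this way, because names starting with the day group files from different months together.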
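For completeness, the original `TypeError` can be reproduced without Pillow or python-docx. This is a small sketch, assuming Python 3, that compares what each kind of loop yields; the sample strings are made up for illustration.

```python
import textwrap

paragraph = "Hi there"

# Iterating over bytes (the result of .encode('utf-8')) yields integers.
as_bytes = list(paragraph.encode('utf-8'))[:2]
print(as_bytes)  # [72, 105]

# Iterating over a str yields single characters, which are still not lines.
as_chars = list(paragraph)[:2]
print(as_chars)  # ['H', 'i']

# textwrap.wrap yields whole lines of at most `width` characters.
as_lines = textwrap.wrap("one two three four", width=9)
print(as_lines)  # ['one two', 'three', 'four']

# Pillow's multiline check does a membership test on the text it is given;
# the same test on an int raises TypeError, as in the traceback.
first = paragraph.encode('utf-8')[0]
try:
    "\n" in first
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```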