Javittoxs

Reputation: 531

Separating OCR text into lines with Python

I'm trying to split a paragraph into a list of lines, where no line may exceed a given pixel width. Here is the class that is supposed to do this:

from font import Font

class Text:
        def __init__(self, text, limit, size):
                self.text = text
                self.limit = limit
                self.size = size        
                self.setText()
        def setText(self):
                textList = self.text.split(' ')
                self.newList = tempo = []
                spaceWidth = Font(self.size, ' ').width
                count = 0
                for x in textList:
                        word = Font(self.size, x)
                        count = count + word.width + spaceWidth
                        if count >= self.limit:
                                self.newList.append(' '.join(tempo))
                                tempo = [x]
                                count = word.width
                        else:
                                tempo.append(x)
                self.newList.append(' '.join(tempo))

As you can see, it uses another class called Font. Here it is:

from PIL import Image,ImageFont

class Font:
        def __init__(self, fontSize, text):
                self.font = ImageFont.truetype('tomnr.ttf', fontSize)
                self.width, self.height = self.font.getsize(text)

The code runs without errors, but the result is wrong. For example:

from text import Text

text = Text("Art inspired apparel for Creative Individuals. Do you SurVibe?", 452, 25)

print text.newList

This is supposed to create lines that are at most 452 pixels wide. It should print

['Art inspired apparel for Creative', 'Individuals. Do you SurVibe?']

but instead it prints:

['Art', 'inspired', 'apparel', 'for', 'Creative', 'Art inspired apparel for Creative', 'Individuals. Do you SurVibe?']

I can't work out what's going on. I think my loop is fine and everything runs smoothly. I'm pretty sure it's a silly mistake, but I couldn't figure it out on my own. Thanks in advance.

Upvotes: 0

Views: 678

Answers (1)

bav

Reputation: 1623

The error is here:

self.newList = tempo = []

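Both names are bound to the same list object, so every tempo.append(x) also appends to self.newList until tempo is rebound by tempo = [x]. That is why the words of the first line appear individually at the start of the output. Create two separate lists instead: self.newList = [] followed by tempo = [].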
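
To illustrate, here is a minimal sketch of the aliasing and one possible corrected loop. The names wrap_words and fake_measure are hypothetical and not part of the question's code; fake_measure simply assumes 10 pixels per character so the example runs without PIL or a font file. In the real class you would pass something like lambda s: Font(size, s).width instead.

```python
# Chained assignment binds both names to ONE list object.
new_list = tempo = []
tempo.append("Art")
print(new_list is tempo)  # True
print(new_list)           # ['Art']  (the append is visible under both names)

# Two list literals create two independent lists.
new_list, tempo = [], []
tempo.append("Art")
print(new_list)           # []


def wrap_words(text, limit, measure):
    """Greedily pack words into lines whose measured width stays within limit.

    measure is any callable returning the pixel width of a string
    (hypothetical parameter, standing in for Font(size, s).width).
    """
    space_width = measure(" ")
    lines = []    # finished lines
    current = []  # words of the line being built, a separate list
    width = 0     # measured width of the current line so far
    for word in text.split():
        word_width = measure(word)
        # Width of the line if this word were added (space only between words).
        needed = word_width if not current else width + space_width + word_width
        if current and needed > limit:
            lines.append(" ".join(current))
            current = [word]
            width = word_width
        else:
            current.append(word)
            width = needed
    if current:
        lines.append(" ".join(current))
    return lines


def fake_measure(s):
    # Assumption for the demo only: every character is 10 pixels wide.
    return 10 * len(s)


print(wrap_words(
    "Art inspired apparel for Creative Individuals. Do you SurVibe?",
    330,
    fake_measure,
))
# ['Art inspired apparel for Creative', 'Individuals. Do you SurVibe?']
```

Note that this sketch differs slightly from the original loop: it counts a space only between words and allows a line that is exactly as wide as the limit, whereas the original adds a space after every word and breaks on >=. Summing word widths is also only an approximation of the rendered width if the font applies kerning.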

Upvotes: 1
