user2496636
user2496636

Reputation: 51

Arabic text is not properly wrapped in reportlab's Paragraph flowable

Let’s say I have this Arabic snippet:

إذا أخذنا بعين الإعتبار طبيعة تقلب المناخ و المتغيرات البينية السنوية و تلك على المدى الطويل إضافة إلى عدم دقة القياسات والحسابات المتبعة

In English this should mean something like: “If we take into account the nature of climate variability and inter-annual variability and those on long-term addition to the lack of accuracy of measurements and calculations used….”

Now I want to render it as Reportlab PDF doc (python):

arabic_text = u'إذا أخذنا بعين الإعتبار طبيعة تقلب المناخ و المتغيرات البينية السنوية و تلك على المدى الطويل إضافة إلى عدم دقة القياسات والحسابات المتبعة'
arabic_text = arabic_reshaper.reshape(arabic_text) # join characters
arabic_text = get_display(arabic_text) # change orientation by using bidi

pdf_file=open('disclaimer.pdf','w')
pdf_doc = SimpleDocTemplate(pdf_file, pagesize=A4)
pdfmetrics.registerFont(TTFont('Arabic-normal', '../fonts/KacstOne.ttf'))
style = ParagraphStyle(name='Normal', fontName='Arabic-normal', fontSize=12, leading=12. * 1.2)
style.alignment=TA_RIGHT
pdf_doc.build([Paragraph(arabic_text, style)])
pdf_file.close()

The result is here https://www.dropbox.com/s/gdyt6930jlad8id/disclaimer.pdf. You can see the text itself is correct and readable (at least for Google Translate), but not wrapped as expected for RTL script.

Upvotes: 5

Views: 1923

Answers (3)

Shoaib Mirzaei
Shoaib Mirzaei

Reputation: 522

use below code. it will wrap it properly.

import arabic_reshaper
from bidi.algorithm import get_display
from reportlab.platypus import SimpleDocTemplate, Paragraph
from reportlab.pdfbase import pdfmetrics
from reportlab.lib.styles import ParagraphStyle
from reportlab.lib.enums import TA_RIGHT
from reportlab.lib.pagesizes import A4
from reportlab.pdfbase.ttfonts import TTFont
from reportlab.pdfbase.pdfmetrics import stringWidth
from reportlab.lib.units import inch

def text_wraping(text,aW):
    text_width = stringWidth(text, style.fontName, style.fontSize)
    space_width = stringWidth(' ', style.fontName, style.fontSize)
    if text_width > aW:
        lines = []
        text = arabic_text.split(' ')
        text.reverse()
        line = ''
        for word in text:
            line_width = stringWidth(line, style.fontName, style.fontSize)
            word_width = stringWidth(word, style.fontName, style.fontSize)
            if (line_width < aW) and (line_width + word_width + space_width < aW):
                line += word + ' '
            else:
                line = line.split(' ')
                line.reverse()
                tmp = ' '
                line = tmp.join(line)
                lines.append(line)
                line = word + ' '
        line = line.split(' ')
        line.reverse()
        tmp = ' '
        line = tmp.join(line)
        lines.append(line)
    return(lines)

pdf_doc = SimpleDocTemplate('disclaimer.pdf', pagesize=A4)
pdfmetrics.registerFont(TTFont('Aims', 'Aims.ttf'))
style = ParagraphStyle(name='Normal', fontName='Aims', fontSize=12, leading=12. * 1.2,wordWrap='RTL')
style.alignment=TA_RIGHT
aW = A4[0]-2*inch  # change this line to the specified width for example paragraph width or table width
arabic_text = 'إذا أخذنا بعين الإعتبار طبيعة تقلب المناخ و المتغيرات البينية السنوية و تلك على المدى الطويل إضافة إلى عدم دقة القياسات والحسابات المتبعة'
arabic_text = arabic_reshaper.reshape(arabic_text) # join characters
arabic_text = get_display(arabic_text) # change orientation by using bidi
lines = text_wraping(arabic_text,aW)
elements = []
for line in lines:
    elements.append(Paragraph(line,style))
pdf_doc.build(elements)

Upvotes: 2

Michael Dowd
Michael Dowd

Reputation: 1

Using the wordwrap module and <br> markup to split up lines; this is not perfect because there is a blank line at the top of each paragraph, but it is a simple solution to some use cases

import textwrap
def ShowArabictext(Text):


#style_comment.alignment = TA_RIGHT
wrkText=Text

isArabic=False
isBidi=False

for c in wrkText:
    cat=unicodedata.bidirectional(c)

    if cat=="AL" or cat=="AN":
        isArabic=True
        isBidi=True
        break
    elif cat=="R" or cat=="RLE" or cat=="RLO":
        isBidi=True

if isArabic:

    #wrkText=arabic_table(wrkText)    

    wrkText=textwrap.wrap( wrkText,70)
    wrkTexttemp=[]
    l=u''
    i=0
    for w in wrkText:
        # break each line with html markup allowed in reportlab 
        l=l+u'<br></br>'+arabic_rtlize.process.shape(arabic_reshaper.reshape(w ))

    wrkText=l



if isBidi:
    wrkText=get_display(wrkText)

return [wrkText,isArabic,isBidi]

Upvotes: 0

MrMMM
MrMMM

Reputation: 11

If you use this branch of Report lab that adds RTL support

and remove this line from your code:

arabic_text = get_display(arabic_text) # change orientation by using bidi

your code will run properly

since they have already been solved with the branch using PyFriBiDi as you can see here:

Some users from the community notably Ury Marshak, Moshe Wagner and Hosam Aly have contributed batches to get PyFriBibi working with ReportLab. We have created an SVN branch for this development and it is available at ...

(The SVN link on that page doesn't work anymore so you should use the bitbucket link I've included!)

I have managed to run this modified version of your code and produce the correct result :

from libs import arabic_reshaper
from bidi.algorithm import get_display
from reportlab.platypus import SimpleDocTemplate, Paragraph
from reportlab.pdfbase import pdfmetrics
from reportlab.lib.styles import ParagraphStyle
from reportlab.lib.enums import TA_RIGHT
from reportlab.lib.pagesizes import A4
from reportlab.pdfbase.ttfonts import TTFont
arabic_text = u'إذا أخذنا بعين الإعتبار طبيعة تقلب المناخ و المتغيرات البينية السنوية و تلك على المدى الطويل إضافة إلى عدم دقة القياسات والحسابات المتبعة'
arabic_text = arabic_reshaper.reshape(arabic_text) # join characters
# arabic_text = get_display(arabic_text) # change orientation by using bidi

pdf_file=open('disclaimer.pdf','w')
pdf_doc = SimpleDocTemplate(pdf_file, pagesize=A4)
pdfmetrics.registerFont(TTFont('Arabic-normal', 'fonts/misc/KacstOne.ttf'))
style = ParagraphStyle(name='Normal', fontName='Arabic-normal', fontSize=12, leading=12. * 1.2)
style.alignment=TA_RIGHT
pdf_doc.build([Paragraph(arabic_text, style)])
pdf_file.close()

Upvotes: 1

Related Questions