Kirill Smirnov

Reputation: 1532

Java Apache FOP 2.2: incorrect rendering of some Cyrillic characters

I'm stuck on a problem that I can't resolve myself. I simplified the source code as much as I could, and here is what I came up with: https://www.dropbox.com/s/ey3f65c4iby7ccn/fop_example.zip

Here is the main piece of code (the template):

<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" font-family="Arial">
        <fo:layout-master-set>
            <fo:simple-page-master master-name="simpleA4" page-height="29.7cm" page-width="21cm">
                <fo:region-body reference-orientation="0"/>
            </fo:simple-page-master>
        </fo:layout-master-set>
        <fo:page-sequence master-reference="simpleA4">
            <fo:flow flow-name="xsl-region-body">
                <fo:block-container>
                    <fo:block>
                        ИмяпассажираКУЛЬДЮШЕВАЛИЯАЛЕКСАНДРОВНАДокументудостоверяющийличностьНомербилетаДоСОЧИСОЧРейсИЖВылетАВГКлассЭРЕГ№ВАЖНАЯИНФОРМАЦИЯ
                    </fo:block>
                </fo:block-container>
            </fo:flow>
        </fo:page-sequence>
</fo:root>

I can't simplify this long text any further, because if I remove any character everything works fine. The problem is with the last letters: instead of "ИНФОРМАЦИЯ" I get "ИНФОРМ~ИЯ". If I remove or add any other Cyrillic letter, the output is correct, so I guess the problem isn't with the font itself.


Why's that? Please help me, I have no idea what's wrong or how to fix it.

P.S. Here is a link to the resulting PDF; maybe you can tell what's wrong just by looking at the file.

P.P.S Tried to replace this text with &#x0418;&#x043c;&#x044f;&#x043f;&#x0430;&#x0441;&#x0441;&#x0430;&#x0436;&#x0438;&#x0440;&#x0430;&#x041a;&#x0423;&#x041b;&#x042c;&#x0414;&#x042e;&#x0428;&#x0415;&#x0412;&#x0410;&#x041b;&#x0418;&#x042f;&#x0410;&#x041b;&#x0415;&#x041a;&#x0421;&#x0410;&#x041d;&#x0414;&#x0420;&#x041e;&#x0412;&#x041d;&#x0410;&#x0414;&#x043e;&#x043a;&#x0443;&#x043c;&#x0435;&#x043d;&#x0442;&#x0443;&#x0434;&#x043e;&#x0441;&#x0442;&#x043e;&#x0432;&#x0435;&#x0440;&#x044f;&#x044e;&#x0449;&#x0438;&#x0439;&#x043b;&#x0438;&#x0447;&#x043d;&#x043e;&#x0441;&#x0442;&#x044c;&#x041d;&#x043e;&#x043c;&#x0435;&#x0440;&#x0431;&#x0438;&#x043b;&#x0435;&#x0442;&#x0430;&#x0414;&#x043e;&#x0421;&#x041e;&#x0427;&#x0418;&#x0421;&#x041e;&#x0427;&#x0420;&#x0435;&#x0439;&#x0441;&#x0418;&#x0416;&#x0412;&#x044b;&#x043b;&#x0435;&#x0442;&#x0410;&#x0412;&#x0413;&#x041a;&#x043b;&#x0430;&#x0441;&#x0441;&#x042d;&#x0420;&#x0415;&#x0413;&#x2116;&#x0412;&#x0410;&#x0416;&#x041d;&#x0410;&#x042f;&#x0418;&#x041d;&#x0424;&#x041e;&#x0420;&#x041c;&#x0410;&#x0426;&#x0418;&#x042f;, still get the same result.

The same text with only the problem characters written as Unicode character references:

ИмяпассажираКУЛЬДЮШЕВАЛИЯАЛЕКСАНДРОВНАДокументудостоверяющийличностьНомербилетаДоСОЧИСОЧРейсИЖВылетАВГКлассЭРЕГ№ВАЖНАЯИНФОРМ&#x0410;&#x0426;&#x0418;Я

I managed to make the example even shorter:

ИмяпсжираКУЛЬДЮШЕВАЯкудсвющийличньорбилетаСЧВыЭГ№ЖНФОРМАЦИЯ

Upvotes: 2

Views: 897

Answers (1)

Kirill Smirnov

It turns out the problem was caused by an incorrect encoding mode in the font configuration:

<font kerning="yes" embed-url="/arial.ttf" encoding-mode="single-byte">
    <font-triplet name="Arial" style="normal" weight="normal"/>
</font>

I should have used cid instead of single-byte, because I embed a .ttf (TrueType) font and, according to the documentation, the default (and, I assume, preferable) value is:

"cid" for Truetype, "single-byte" for Type 1
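For reference, the corrected font entry would look roughly like the sketch below. It reuses the embed-url and font-triplet from the snippet above and changes only encoding-mode; the rest of the configuration file is assumed to stay as it was.

```xml
<!-- Same font entry as before; only encoding-mode changes from "single-byte" to "cid" -->
<font kerning="yes" embed-url="/arial.ttf" encoding-mode="cid">
    <font-triplet name="Arial" style="normal" weight="normal"/>
</font>
```

Since the documentation quoted above gives "cid" as the default for TrueType, leaving the encoding-mode attribute out entirely should presumably have the same effect.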

However, I suspect this is a bug in the library, because if I want to embed the font completely I have to use the single-byte mode. From the documentation:

When embedding TrueType (ttf) or TrueType Collections (ttc), a subset of the original font, containing only the glyphs used, is embedded in the output document. That's the default, but if you specify encoding-mode="single-byte" (see above), the complete font is embedded.

Upvotes: 2
