user4602228

PDF syntax BT ET text operations

Looking at the BT ... ET text operations in a PDF content stream, I have some questions that I couldn't answer from the PDF32000 manual:

A good online resource link would also be great...

BT /F1 24 Tf ... (My String is here)Tj ET

Looking at this piece of code,

I know I could use PDFKit and PDFJS etc... but I would really like to know how PDF syntax works, the online PDF32000 manual is really long and complex...

Upvotes: 0

Views: 4168

Answers (1)

mkl

Reputation: 95918

First of all, Michaël of course is right saying

Unfortunately, if you want to understand how PDF syntax works, you'll need to read the specification. You could of course only read the part that applies to text, Chapter 9, and probably 9.2. But this chapter of course assumes that you have knowledge of how PDF works structurally and what the types of objects are. I suggest reading it.

To give you some impressions, though, here are some answers to your questions...

Line breaks

How do I insert line breaks? using \n\r didn't help

You draw text on different lines by breaking it into separate strings, one for each line, and drawing them separately, advancing to the next line in-between by repositioning the current text position. There are different ways to do this repositioning. E.g.

(Line 1 text) Tj
0 -20 Td
(Line 2 Text) Tj
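Another way to do the repositioning is to set the text leading once with TL and then move to the next line with T*, which has the same effect as "0 -leading Td". As a rough sketch (not part of the original answer), the following Python helper generates such a text object from a list of lines. The names pdf_literal and text_block, the font resource name F1 and the start position are only assumptions for illustration.

```python
def pdf_literal(text):
    """Wrap text as a PDF literal string, escaping backslashes and parentheses."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return "(" + escaped + ")"


def text_block(lines, font="F1", size=12, x=15, y=815, leading=20):
    """Build a BT ... ET text object that draws one string per line.

    The font resource name, position and leading are illustrative defaults.
    """
    ops = [
        "BT",
        f"/{font} {size} Tf",  # select font and size
        f"{leading} TL",       # set the leading used by T*
        f"{x} {y} Td",         # move to the start of the first line
    ]
    for index, line in enumerate(lines):
        if index > 0:
            ops.append("T*")   # go to the next line, i.e. "0 -leading Td"
        ops.append(pdf_literal(line) + " Tj")
    ops.append("ET")
    return "\n".join(ops)


print(text_block(["Line 1 text", "Line 2 Text"]))
```

The helper only produces the operator sequence. The font F1 still has to be declared in the page resources for a viewer to render it.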

Font weight

How can I change the font-weight to bold in the middle of the string

By selecting a bold font, e.g.

/MyNormalFont 12 Tf
(Normal text - ) Tj
/MyBoldFont 12 Tf
(bold text) Tj
/MyNormalFont 12 Tf
( - normal text again) Tj

you can output "Normal text - bold text - normal text again".

(There also are poor-man's-bold effects like double printing the letters with a small offset...)
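If you generate such content from a program, the font switching can be automated. The following Python sketch (my own illustration, with the hypothetical helper name styled_line) takes a list of (font resource name, text) runs and emits a Tf instruction only where the font actually changes.

```python
def styled_line(runs, size=12):
    """Emit Tf/Tj instructions for (font_name, text) runs on one line.

    font_name is assumed to be a font resource name declared for the page.
    """
    ops = []
    current_font = None
    for font_name, text in runs:
        if font_name != current_font:
            ops.append(f"/{font_name} {size} Tf")  # switch font only when needed
            current_font = font_name
        # Escape backslashes and parentheses for the PDF literal string.
        escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        ops.append(f"({escaped}) Tj")
    return "\n".join(ops)


print(styled_line([
    ("MyNormalFont", "Normal text - "),
    ("MyBoldFont", "bold text"),
    ("MyNormalFont", " - normal text again"),
]))
```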

Other languages

Tried languages other than english and couldn't get it right, how can I do other languages, hebrew, arabic, chinese, etc... changing the font didn't help, encoding to UTF16 didn't help either, should I encode to something different and set encoding somewhere?

You have to declare fonts to use them in content streams. In these declarations you define, in particular, the encoding to use for text drawn using the font in question. For the samples above the encoding must have been something ASCII'ish, e.g. WinAnsiEncoding, but you'll often find other encodings, in particular for non-English text.

For this you have to look at the Font entries in the Resources dictionary. For details cf. chapter 9 of the specification.

See also the example under "A non-Latin character" below.

Limit text width

How to limit the text to a certain width

By drawing few enough characters.

As explained above under "Line breaks", you have to split the text into lines before putting the string drawing instructions into the content stream. Simply make these lines as short as necessary.
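To decide where to split, you need the glyph widths of the font. For simple fonts these are given in units of 1/1000 of the font size (cf. the Widths array in the font example further down), so a string drawn at size s is roughly sum(widths) * s / 1000 units wide, ignoring character spacing, word spacing and horizontal scaling. A minimal greedy word wrap in Python under these assumptions might look like this; the function name wrap and the char_width callback are hypothetical.

```python
def wrap(text, max_width, size, char_width):
    """Greedily split text into lines no wider than max_width.

    char_width(c) is assumed to return the glyph width of character c
    in 1/1000 units, as found in a font's Widths array.
    A single word wider than max_width is kept on its own line.
    """
    def width(s):
        return sum(char_width(c) for c in s) * size / 1000

    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and width(candidate) > max_width:
            lines.append(current)  # line is full, start a new one
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


# Example with a made-up monospaced font whose glyphs are all 600 units wide.
print(wrap("aaa bbb ccc ddd", max_width=72, size=12, char_width=lambda c: 600))
```

Each resulting line can then be drawn with its own Tj instruction as shown under "Line breaks".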

Line height

How to set line-height properties

Do you mean the distance from the base of one line to the base of the next? Or do you mean the font height?

How the former distance is selected depends on how you go to the next line, see above under "Line breaks". If you do so like in the example there, you go down to the next line by 20 units using

0 -20 Td

You set the latter height, the font height, in the font selection instruction, e.g. in "Font weight" above

/MyNormalFont 12 Tf

selects MyNormalFont at a size of 12 units.

Concerning those units: A unit usually starts out as 1/72 inch but by changing the transformation matrix (cf. section 8 of the specification) you can change it.
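As a small worked example of the unit arithmetic, assuming the default of 1/72 inch per unit is still in effect (no changed transformation matrix), millimetres can be converted like this. The helper name mm_to_units is only illustrative.

```python
def mm_to_units(mm):
    """Convert millimetres to default user space units (1/72 inch each)."""
    return mm * 72 / 25.4  # 25.4 mm per inch, 72 units per inch


# An A4 page is 210 mm x 297 mm.
print(round(mm_to_units(210), 2), round(mm_to_units(297), 2))
```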

Multiple fonts

Mixing multiple fonts in the same line

See "Font weight" above: the different font weights are implemented using different fonts, so mixing fonts in a line works the same way.

A non-Latin character

From a comment:

could you give an example of inserting a foreign language character other than english?

One option you have is to create a PDF font that maps the characters you need from a given font program by name into the 0..255 range for a single-byte encoding. This is ok for Hebrew or Arabic writing but less so for CJK writing.

As you asked for only one character, I only put a single character in the example... Furthermore, I use Arial and expect the PDF viewer in question to find it on the system at hand, i.e. I don't embed it.

Thus, for a font with the Arabic character alef maksura named alefmaksuraarabic in the Adobe Glyph List put at code 32 (the space in ASCII derived encodings), you can use:

1 0 obj
<<
  /Type /Font
  /Subtype /TrueType
  /BaseFont /Arial
  /Encoding
  <<
    /BaseEncoding /WinAnsiEncoding
    /Differences [ 32 /alefmaksuraarabic ]
  >>
  /FirstChar 32
  /LastChar 32
  /FontDescriptor 2 0 R
  /Widths [ 600 ]
>> 
endobj
2 0 obj
<<
  /Type /FontDescriptor
  /FontName /Arial
  /StemV 44
  /Leading 33
  /Ascent 905
  /Flags 32
  /XHeight 250
  /FontWeight 400
  /AvgWidth 441
  /Descent -210
  /CapHeight 728
  /MaxWidth 2665
  /FontBBox [-665 -210 2000 728]
  /ItalicAngle 0
>>
endobj
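If you need more than one remapped character, the Differences array lists a start code followed by the glyph names for that code and the consecutive codes after it, then the next start code, and so on. The following Python sketch builds such an array from a code-to-glyph-name mapping. The function name differences_array is hypothetical, and the glyph names in the second call are merely examples of Adobe Glyph List style names.

```python
def differences_array(code_to_glyph):
    """Format a Differences array from a {code: glyph_name} mapping.

    A new start code is written whenever the codes are not consecutive.
    """
    parts = []
    previous = None
    for code in sorted(code_to_glyph):
        if previous is None or code != previous + 1:
            parts.append(str(code))  # start a new run at this code
        parts.append("/" + code_to_glyph[code])
        previous = code
    return "[ " + " ".join(parts) + " ]"


print(differences_array({32: "alefmaksuraarabic"}))
print(differences_array({65: "alef", 66: "bet", 70: "gimel"}))
```

The first call reproduces the array used in the font object above.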

For a standard Times-Roman/WinAnsiEncoding font as font resource F and the font defined above as font resource G, you can write

BT
/F 12 Tf
15 815 Td
(Test: ) Tj
/G 12 Tf
( ) Tj
ET

into your content stream and get

(screenshot of the rendered text)

Upvotes: 7
