Reputation: 6625
After following this CodeProject sample we were able to hand-craft our own PDF 1.4 (Adobe 5.x) engine. It works well with Latin-based text (English, French, etc.).
But text in Chinese, Korean, Greek and other non-Latin (Unicode) languages comes out scrambled. Specifically, our font details are invalid:
<</Type/Font/Subtype/Type1/BaseFont/Courier/Encoding/WinAnsiEncoding>>
The built-in Courier font cannot display Chinese, Korean, etc.
Is there a simple way to ask the PDF renderer to use a built-in font (like MS Gothic) without having to embed or create font details in the PDF?
Alternatively, is there a PDF creation library or SDK (free or commercial) that will work in C# with the .NET Compact Framework? Specifically, any of the following:
Upvotes: 0
Views: 58
Reputation: 11857
It is possible to write a PDF as plain text, without subsetting or embedding fonts, as long as the reader understands the Adobe font references and the fonts are correctly described. Traditionally, the core Greek letters are in the standard "Symbol" font, where they are used for scientific terms.
Ignore the background image below, as that would be a different problem.
Here is one result as a text-based PDF. It is too large to attach, but basically you can reverse the method by editing the source and outputting it as text. Without the image, the PDF (asText.pdf) is 7 KB, which is far smaller than with any fonts included.
For a short period the text-based PDF with the image is linked here: https://filetransfer.io/data-package/kBQqilbj#link There are two variations without the binary image.
If I remove the more complex "Eastern European" text, the CJK-only file is much simpler, at 2.3 KB, and works well in a PDF-enabled browser, etc. The key to it working is writing the text as UTF-16 in hexadecimal, e.g. <1234> with a font whose encoding is /UniJIS-UTF16-H.
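To produce those hexadecimal strings programmatically, something like the following Python sketch should do. The helper name pdf_hex_string is my own, and it assumes the font uses one of the Uni*-UTF16-H encodings, so the string is simply big-endian UTF-16 code units with no byte order mark.

```python
def pdf_hex_string(text):
    """Return text as a PDF hex string of UTF-16BE code units (no BOM).

    Assumption: the selected font uses a Uni*-UTF16-H CMap such as
    UniJIS-UTF16-H, so each character is written as its UTF-16 code units.
    """
    return "<" + text.encode("utf-16-be").hex() + ">"


# Usage: the Japanese greeting used in the listing below.
print(pdf_hex_string("こんにちは"))  # <30533093306b3061306f>
```

The result goes straight in front of a Tj operator in the content stream.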
%PDF-1.7
1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj
2 0 obj<</Type/Pages/Count 1/Kids[14 0 R]>>endobj
3 0 obj<</Type/Font/Subtype/Type1/BaseFont/Times-Roman/Encoding/WinAnsiEncoding>>endobj
4 0 obj<</Type/Font/Subtype/Type0/BaseFont/Ming/Encoding/UniCNS-UTF16-H/DescendantFonts[5 0 R]>>endobj
5 0 obj<</Type/Font/Subtype/CIDFontType0/BaseFont/Ming/CIDSystemInfo<</Registry(Adobe)/Ordering(CNS1)/Supplement 7>>/FontDescriptor 6 0 R>>endobj
6 0 obj<</Type/FontDescriptor/FontName(Ming)/FontBBox[-200 -200 1200 1200]/Flags 6/ItalicAngle 0/Ascent 1000/Descent -200/StemV 80>>endobj
7 0 obj<</Type/Font/Subtype/Type0/BaseFont/Mincho/Encoding/UniJIS-UTF16-H/DescendantFonts[8 0 R]>>endobj
8 0 obj<</Type/Font/Subtype/CIDFontType0/BaseFont/Mincho/CIDSystemInfo<</Registry(Adobe)/Ordering(Japan1)/Supplement 6>>/FontDescriptor 9 0 R>>endobj
9 0 obj<</Type/FontDescriptor/FontName(Mincho)/FontBBox[-200 -200 1200 1200]/Flags 6/ItalicAngle 0/Ascent 1000/Descent -200/StemV 80>>endobj
10 0 obj<</Type/Font/Subtype/Type0/BaseFont/Batang/Encoding/UniKS-UTF16-H/DescendantFonts[11 0 R]>>endobj
11 0 obj<</Type/Font/Subtype/CIDFontType0/BaseFont/Batang/CIDSystemInfo<</Registry(Adobe)/Ordering(Korea1)/Supplement 2>>/FontDescriptor 12 0 R>>endobj
12 0 obj<</Type/FontDescriptor/FontName(Batang)/FontBBox[-200 -200 1200 1200]/Flags 6/ItalicAngle 0/Ascent 1000/Descent -200/StemV 80>>endobj
13 0 obj<</Length 365>>stream
q 50 50 m 100 200 l 200 50 l 1 0 0 rg f Q
q
0 0 1 rg BT /TmRm 24 Tf 1 0 0 1 50 760 Tm (Hello, world!) Tj ET
BT /Song 24 Tf 1 0 0 1 50 670 Tm <4f60597dff0c4e16754cff01> Tj ET
BT /Mincho 24 Tf 1 0 0 1 50 640 Tm <30533093306b3061306f> Tj ET
BT 1 0 0 1 50 615 Tm <30cf30ed30fc30ef30fc30eb30c9ff01> Tj ET
BT /Batang 24 Tf 1 0 0 1 50 580 Tm <c548b155d558c138c694> Tj ET
Q
endstream
endobj
14 0 obj <</Type/Page/MediaBox[0 0 595 842]/Rotate 0/Resources<</Font<</TmRm 3 0 R/Song 4 0 R/Mincho 7 0 R/Batang 10 0 R>>>>/Contents 13 0 R/Parent 2 0 R>>endobj
xref
0 15
0000000000 65535 f
0000000009 00000 n
0000000052 00000 n
0000000102 00000 n
0000000190 00000 n
0000000293 00000 n
0000000439 00000 n
0000000578 00000 n
0000000683 00000 n
0000000833 00000 n
0000000974 00000 n
0000001080 00000 n
0000001232 00000 n
0000001374 00000 n
0000001788 00000 n
trailer
<</Size 15/Root 1 0 R>>
startxref
1951
%%EOF
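Each CJK font in the listing above is the same trio of objects: a Type0 font, its CIDFont descendant, and a FontDescriptor. As a rough sketch (the function name cjk_font_objects and its parameters are hypothetical, and the descriptor values are simply copied from the listing, not tuned per font), the trio could be generated like this:

```python
def cjk_font_objects(first_num, base_font, cmap, ordering, supplement):
    """Return three object strings for one non-embedded CJK font.

    Objects are numbered first_num, first_num + 1 and first_num + 2.
    The FontDescriptor metrics are the placeholder values from the listing.
    """
    n0, n1, n2 = first_num, first_num + 1, first_num + 2
    type0 = (f"{n0} 0 obj<</Type/Font/Subtype/Type0/BaseFont/{base_font}"
             f"/Encoding/{cmap}/DescendantFonts[{n1} 0 R]>>endobj")
    cid = (f"{n1} 0 obj<</Type/Font/Subtype/CIDFontType0/BaseFont/{base_font}"
           f"/CIDSystemInfo<</Registry(Adobe)/Ordering({ordering})"
           f"/Supplement {supplement}>>/FontDescriptor {n2} 0 R>>endobj")
    desc = (f"{n2} 0 obj<</Type/FontDescriptor/FontName({base_font})"
            f"/FontBBox[-200 -200 1200 1200]/Flags 6/ItalicAngle 0"
            f"/Ascent 1000/Descent -200/StemV 80>>endobj")
    return [type0, cid, desc]


# Usage: reproduces objects 4 to 6 of the listing.
for obj in cjk_font_objects(4, "Ming", "UniCNS-UTF16-H", "CNS1", 7):
    print(obj)
```

Calling it with (7, "Mincho", "UniJIS-UTF16-H", "Japan1", 6) and (10, "Batang", "UniKS-UTF16-H", "Korea1", 2) gives the other two fonts.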
If you write the above programmatically, or cut and paste it into MS Notepad (or a similar text editor), save it as ANSI text with Unix (LF) line endings, because the xref offsets are byte counts and assume one byte per line ending. Adobe Acrobat Reader/editor DC will then show the text fairly thin, as if unweighted, but a few extra characters at the start of the body text will sort that out by thickening the strokes.
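If you do go the programmatic route, the fiddly part is keeping the xref offsets and the startxref value in step with the bytes actually written. Here is a minimal Python sketch of that bookkeeping. The name build_pdf is my own, and it assumes the objects are passed in order, numbered 1 to n, contain only ANSI (Latin-1) characters, and that object 1 is the catalog.

```python
def build_pdf(objects):
    """Join numbered objects into a PDF and write a matching xref table.

    Assumptions: objects[i] holds object number i + 1, object 1 is the
    catalog, and every line ends with a single LF byte.
    """
    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for body in objects:
        offsets.append(len(out))            # byte offset where the object starts
        out += body.encode("latin-1") + b"\n"
    xref_pos = len(out)                     # startxref points at the "xref" keyword
    size = len(objects) + 1                 # + 1 for the free entry, object 0
    out += f"xref\n0 {size}\n".encode("ascii")
    out += b"0000000000 65535 f \n"         # every xref entry is exactly 20 bytes
    for off in offsets:
        out += f"{off:010d} 00000 n \n".encode("ascii")
    out += f"trailer\n<</Size {size}/Root 1 0 R>>\n".encode("ascii")
    out += f"startxref\n{xref_pos}\n%%EOF\n".encode("ascii")
    return bytes(out)


# Usage: a two-object skeleton; a real file would add pages, fonts and content.
skeleton = build_pdf([
    "1 0 obj<</Type/Catalog/Pages 2 0 R>>endobj",
    "2 0 obj<</Type/Pages/Count 0/Kids[]>>endobj",
])
```

Write the returned bytes in binary mode, so the platform does not turn LF into CR LF and shift every offset.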
Upvotes: 0