Reputation: 2377
I am using pdf.js. Fetching the Text I get blocks with font info
Object {
str: "blabla",
dir: "ltr",
width: 191.433141,
height: 12.546,
transform: Array[6],
fontName: "g_d0_f2"
}
Is it possible to get somehow more information about g_d0_f2.
Upvotes: 3
Views: 3033
Reputation: 2691
Notice the PDF.js getTextContent will not and not suppose to match glyphs in PDFs. The PDF32000 specification has two different algorithms for text display and extraction. Even if you can lookup font data in the page.commonObjs, it might not be really helpful for extracted text content display due to glyphs encoding mismatch.
The page's getTextContent is doing text extraction and getOperatorList gets (glyph) display operators. See how src/display/svg.js renderer displays glyphs.
Upvotes: 1