Reputation: 582
I have a PDF document with Ukrainian text (Cyrillic letters). But when I copy it and paste it into an input field, I get something like this:
ȿɄɈɇɈɆȱɄɈ-ɋɌȺɌɂɋɌɂɑɇɂɃ ȺɇȺɅȱɁ ȼɂȻȱɊɄɈȼɈȽɈ
No encoding detection tool or converter has helped.
What is this, and how can I copy the normal Ukrainian text?
Upvotes: 0
Views: 133
Reputation: 5707
The PDF was most likely created with an embedded font subset and no ToUnicode mapping. The character codes used in the page content are mapped to glyphs embedded in the PDF, which is why the text displays correctly, but nothing maps those codes back to standard Unicode code points, so copying produces gibberish. The only way to extract the original text would be some form of OCR.
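To illustrate the mechanism, here is a toy model in Python. It is not a real PDF parser: the character codes, the `to_unicode` table, the `extract_text` helper and the `guess_base` fallback are all made up for this sketch, and real viewers use other heuristics when the mapping is missing.

```python
# Toy model only: real PDFs keep this data in font dictionaries and CMap streams.

# Character codes as they might appear in a page's content stream (made-up values).
content_codes = [0x01, 0x02, 0x03, 0x04, 0x03, 0x05]

# What a ToUnicode mapping provides: character code -> Unicode text (hypothetical entries).
to_unicode = {0x01: "Е", 0x02: "К", 0x03: "О", 0x04: "Н", 0x05: "М"}


def extract_text(codes, to_unicode_map=None, guess_base=0x0240):
    """Return the text a copy operation would yield for the given codes."""
    if to_unicode_map is not None:
        # With a mapping, every code resolves to the intended character.
        return "".join(to_unicode_map[code] for code in codes)
    # Assumed fallback: with no mapping, treat each code as an offset into
    # an unrelated Unicode range, which yields well-formed but wrong letters.
    return "".join(chr(guess_base + code) for code in codes)


with_map = extract_text(content_codes, to_unicode)
without_map = extract_text(content_codes)

print(with_map)     # the intended Cyrillic text
print(without_map)  # valid Unicode, but not the letters shown on the page
```

The glyphs drawn on the page are the same in both cases; only the text that copy and paste can recover differs, which is why OCR on the rendered page still works.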
Upvotes: 1