Захар Joe

Reputation: 693

Weird characters in a Microsoft Word document won't export/can't be searched

I have a sloppily authored document: a dictionary that contains Cyrillic characters. Most of the dictionary is manageable, but I'm stuck on one thing. Words contain accented letters, and most of them are encoded properly as a letter followed by a Unicode combining accent (thus forming a single letter). However, some very peculiar letters look something like a;´ (where "a" stands for an arbitrary Cyrillic letter), where you'd expect á. That wouldn't be a problem per se if the thing could be exported to, say, HTML and manipulated in a text editor. The problem is that Word treats this "thing" as a single character/entity and

At this point I'm trying to:

Here's a sample Word file.

Here's a screenshot of the word/letter in question:

[screenshot of the word in question]

which, when typed correctly, should appear as "скре́пка".

Upvotes: 3

Views: 973

Answers (2)

Jukka K. Korpela

Reputation: 201508

Assuming that @Anonimista's analysis is correct, as I think it is, you could fix the file by running search-and-replace operations in Word, replacing e.g. ^19eq \o(е;´)^21 with е́ (the latter is the Cyrillic letter е followed by the combining acute accent U+0301). This is tedious, because you would need to do it for each vowel separately (and for uppercase vowels too). I cannot find a way to use wildcards in this context: the codes ^19 and ^21 for the start and end of a field work only when wildcards are not enabled.
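To make the per-vowel repetition less error-prone, the find/replace pairs can be generated rather than typed by hand. The following is only a sketch: the vowel list assumes the dictionary is Russian, the field text copies the form quoted above (including the single space and the ´ character), and `replacement_pairs` is a hypothetical helper name. The pairs still have to be entered in Word's Find and Replace dialog one by one, with wildcards off.

```python
COMBINING_ACUTE = "\u0301"  # combining acute accent, U+0301

# Assumption: Russian vowels; adjust for the language of the dictionary.
VOWELS = "аеиоуыэюя"


def replacement_pairs(vowels=VOWELS):
    """Return (find, replace) strings for each lowercase and uppercase vowel."""
    pairs = []
    for letter in vowels + vowels.upper():
        # ^19 and ^21 are Word's codes for the start and end of a field.
        find = "^19eq \\o(" + letter + ";´)^21"
        pairs.append((find, letter + COMBINING_ACUTE))
    return pairs


if __name__ == "__main__":
    for find, replace in replacement_pairs():
        print(find, "->", replace)
```

If the real field codes differ in spacing or capitalization from the form quoted above, the generated find strings would need to be adjusted accordingly.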

Upvotes: 0

Anonimista

Reputation: 752

The 'character' appears to be a Word field of type 'eq' (equation). Here is the field with toggled field codes:

[screenshot of the field with field codes toggled on]

If the document is large, you could try writing a VBA routine that removes these fields and replaces each one with the corresponding character.
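The core of such a routine is the mapping from a field's code text to the intended character. Here is a minimal sketch of that mapping in Python rather than VBA, so the logic can be tested outside Word. It assumes the field code has the form `eq \o(е;´)` with `;` or `,` as the separator; `field_code_to_text` is a hypothetical name, and reading the code text out of the document (for example from each field in a VBA loop) is not shown.

```python
import re

COMBINING_ACUTE = "\u0301"  # combining acute accent, U+0301

# Matches an overstrike field code such as:  eq \o(е;´)
# Assumptions: exactly one letter, then ';' or ',', then an acute-like mark.
EQ_OVERSTRIKE = re.compile(
    r"^\s*eq\s+\\o\s*\(\s*([^\W\d_])\s*[;,]\s*[´']\s*\)\s*$",
    re.IGNORECASE,
)


def field_code_to_text(code):
    """Return letter + U+0301 for a matching field code, otherwise None."""
    match = EQ_OVERSTRIKE.match(code)
    if match is None:
        return None  # not the pattern we know how to convert; leave it alone
    return match.group(1) + COMBINING_ACUTE


if __name__ == "__main__":
    print(field_code_to_text("eq \\o(е;´)"))
```

Returning `None` for anything that does not match keeps the conversion conservative: fields of other kinds, or overstrikes with other marks, would be left untouched rather than guessed at.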

Upvotes: 1
