Which encoding failure turned "vóór" into "v3/43/4r"?

Question

A while ago, I saw the text "v3/43/4r" in a document.

I know it comes from "vóór" (in Dutch, the acute accents mark emphasis), and I wonder which encoding failure produced this result.

rodrigo · Accepted Answer

Some time ago I wrote a program that does this kind of analysis semi-automatically (maybe I'll publish it some day...). Here is the result, with a bit of imagination added:

ó is U+00F3, and it is encoded as the same byte value (0xF3) in many different encodings (most ISO-8859-* and most Western Windows-* codepages).
In CP850, the byte 0xF3 is ¾ (U+00BE), the three-quarters character. The same holds in other, less common codepages (CP775, CP856, CP857, CP858).
The ¾ is sometimes transliterated to 3/4 when the character is not directly available.
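
The byte-level claims above can be checked with a short Python sketch. This is not the analysis program mentioned earlier; the helper name `codecs_decoding_to` and the candidate list are made up for illustration, and the list only covers codecs that ship with a standard Python installation.

```python
# Hypothetical helper: find which single-byte codecs decode a given byte
# to a given character. The candidate list is an arbitrary sample.
CANDIDATES = ["latin-1", "cp1252", "iso8859-15", "cp437", "cp775", "cp850", "cp857", "cp858"]

def codecs_decoding_to(byte_value, target, names=CANDIDATES):
    matches = []
    for name in names:
        try:
            if bytes([byte_value]).decode(name) == target:
                matches.append(name)
        except (UnicodeDecodeError, LookupError):
            # Byte undefined in this codec, or codec not available: skip it.
            continue
    return matches

as_o_acute = codecs_decoding_to(0xF3, "ó")          # "ANSI"-style encodings
as_three_quarters = codecs_decoding_to(0xF3, "¾")   # "OEM"-style codepages

print("0xF3 -> ó in:", as_o_acute)
print("0xF3 -> ¾ in:", as_three_quarters)
```

Any codec in the first list paired with any codec in the second is a candidate explanation for the ó -> ¾ step.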

And there you are! "vóór" -> "v¾¾r" -> "v3/43/4r".
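
The whole chain can be reproduced in a few lines of Python. This is a minimal sketch that assumes the text was written as Windows-1252 and read as CP850; the function names `mojibake` and `transliterate` are hypothetical, and the transliteration step is a plain string replacement standing in for whatever tool did it in the original document.

```python
def mojibake(text, written_as="cp1252", read_as="cp850"):
    # Encode with the writer's codepage, then decode with the reader's.
    return text.encode(written_as).decode(read_as)

def transliterate(text):
    # Assumed fallback: spell out the fraction when ¾ is not available.
    return text.replace("¾", "3/4")

step1 = mojibake("vóór")
step2 = transliterate(step1)

print(step1)  # v¾¾r
print(step2)  # v3/43/4r
```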

The first part (ó -> ¾) is the usual mix-up between the ANSI and OEM codepages on Western Windows versions (in my country ANSI=Windows-1252, OEM=CP850). You can reproduce it easily: create a file with NOTEPAD, type vóór, save it, and dump it in a command prompt with type.