programmernovice

Reputation: 3941

Relation between .NET Encoding and Characterset

What's the relation between the CharacterSet setting here:
http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx
and the ASCII encoding here:
http://msdn.microsoft.com/en-us/library/system.text.asciiencoding.getbytes(VS.71).aspx

Upvotes: 4

Views: 6452

Answers (5)

Lane

Reputation: 2719

I've compiled my own reference in order to switch between the two:

Windows code page       Name            System.Text.Encoding    schema.ini CharacterSet
20127                   ASCII (US)      ASCII                   20127
1252                    ANSI Latin I    Default                 ANSI
65001                   UTF-8           UTF8                    65001
1200                    UTF-16 LE       Unicode                 Unicode
1201                    UTF-16 BE       BigEndianUnicode        1201
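The table can be turned into a small lookup. This is only a sketch: `SchemaIniCharacterSet` and `ToEncoding` are hypothetical names, the "OEM" case comes from the schema.ini documentation rather than the table, and the provider registration assumes .NET Core 3.0 or later (on .NET Framework the legacy code pages are available without it).

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper: resolves a schema.ini CharacterSet value to a .NET Encoding.
static class SchemaIniCharacterSet
{
    static SchemaIniCharacterSet()
    {
        // Assumption: modern .NET, where code pages such as 1252 are only
        // available after registering the code pages provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    public static Encoding ToEncoding(string characterSet)
    {
        switch (characterSet.Trim().ToUpperInvariant())
        {
            case "ANSI":
                // The machine's ANSI code page (what Encoding.Default meant on .NET Framework).
                return Encoding.GetEncoding(CultureInfo.CurrentCulture.TextInfo.ANSICodePage);
            case "OEM":
                return Encoding.GetEncoding(CultureInfo.CurrentCulture.TextInfo.OEMCodePage);
            case "UNICODE":
                return Encoding.Unicode; // UTF-16 LE, code page 1200
            default:
                // Anything else is treated as a numeric Windows code page, e.g. "65001".
                // A non-numeric value throws FormatException.
                return Encoding.GetEncoding(int.Parse(characterSet.Trim(), CultureInfo.InvariantCulture));
        }
    }
}
```

For example, `SchemaIniCharacterSet.ToEncoding("65001")` returns the UTF-8 encoding, and `ToEncoding("Unicode")` returns `Encoding.Unicode`.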

Upvotes: 0

Hans Passant

Reputation: 941227

This is really, really ancient. ODBC dates from the stone age, back when Windows started taking over from MS-DOS. Back then, lots of text was still encoded in the original IBM-PC character set, named the "OEM Character Set" by Microsoft. The standard IBM-PC set had some accented characters and pseudo-graphics glyphs in the upper half, codes 0x80-0xff.

That set was too limited for text output in non-English languages, so Microsoft started using code pages: sets of character glyphs suitable for a certain language group. The American English set of characters was standardized by ANSI; that label is now attached (incorrectly) to any non-OEM code page.

Nobody encodes text in the OEM character set anymore; it went the way of the dodo at least 10 years ago. The proper setting here is ANSI, while keeping your fingers crossed behind your back that the code page used to encode the text matches your system's default code page. That approach is a dodo too; Unicode solved the problem.

Upvotes: 2

to StackOverflow

Reputation: 124696

ANSI is the current Windows ANSI code page, equivalent to Encoding.Default.

OEM is the current OEM code page, typically used by console applications.

You can get this using:

Encoding.GetEncoding(CultureInfo.CurrentCulture.TextInfo.OEMCodePage)

In a console application, the OEM encoding will also be available using

Console.OutputEncoding
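To see what these values are on a given machine, a minimal console program along these lines should do. It only reads the code page numbers, so the output depends entirely on the system. One caveat: on .NET Core and .NET 5+, `Encoding.Default` is UTF-8 rather than the ANSI code page, so `TextInfo.ANSICodePage` is the safer way to find the ANSI value there.

```csharp
using System;
using System.Globalization;
using System.Text;

class Program
{
    static void Main()
    {
        TextInfo textInfo = CultureInfo.CurrentCulture.TextInfo;

        // Code page numbers for the current culture; values vary by machine.
        Console.WriteLine("ANSI code page:   " + textInfo.ANSICodePage);
        Console.WriteLine("OEM code page:    " + textInfo.OEMCodePage);

        // What the console is actually using for output.
        Console.WriteLine("Console output:   " + Console.OutputEncoding.CodePage);

        // The ANSI code page on .NET Framework; UTF-8 (65001) on .NET Core / .NET 5+.
        Console.WriteLine("Encoding.Default: " + Encoding.Default.CodePage);
    }
}
```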

Upvotes: 11

devio

Reputation: 37205

From my understanding, CharacterSet=ANSI is equivalent to Encoding.Default. OEM might be ASCIIEncoding then.

However, ANSI uses the system ANSI code page, so incompatibilities may arise if the same file is accessed from computers with different code pages.
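A rough illustration of that incompatibility, assuming one machine uses Windows-1252 (Western European) and the other Windows-1251 (Cyrillic) as its ANSI code page. `CodePageMismatch` and `Decode` are made-up names, and the provider registration assumes .NET Core 3.0 or later.

```csharp
using System;
using System.Text;

static class CodePageMismatch
{
    // Decodes raw bytes using an explicit Windows code page.
    public static string Decode(byte[] bytes, int codePage)
    {
        // Assumption: modern .NET, where legacy code pages must be registered first.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(bytes);
    }

    static void Main()
    {
        // "café" as written by a machine whose ANSI code page is 1252.
        byte[] bytes = { 0x63, 0x61, 0x66, 0xE9 };

        string western = Decode(bytes, 1252);  // "caf\u00E9" (café)
        string cyrillic = Decode(bytes, 1251); // "caf\u0439" (the last byte becomes й)

        // Same bytes, different text: prints False.
        Console.WriteLine(western == cyrillic);
    }
}
```

The bytes in the file never change; only the interpretation does, which is why CharacterSet=ANSI is only safe when writer and reader share the same ANSI code page.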

Upvotes: 0

o.k.w

Reputation: 25790

The short answer to your question: there's no direct relation.

The longer version:
CharacterSet for the "Schema.ini" file can be either ANSI or OEM.
ANSI and ASCII refer to different things.

You can read more of it here:
Understanding ASCII and ANSI Characters
ASCII vs ANSI Encoding by Alex Hoffman
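To make the ASCII/ANSI difference concrete, here is a small sketch comparing `Encoding.ASCII` with Windows-1252, used here as one example of an "ANSI" code page. The class and method names are illustrative only, and the provider registration assumes .NET Core 3.0 or later.

```csharp
using System;
using System.Text;

static class AsciiVersusAnsi
{
    // ASCII is 7-bit: characters above U+007F are replaced with '?' (0x3F).
    public static byte[] EncodeAscii(string text)
    {
        return Encoding.ASCII.GetBytes(text);
    }

    // Windows-1252 is an 8-bit "ANSI" code page that does contain U+00E9.
    public static byte[] EncodeWindows1252(string text)
    {
        // Assumption: modern .NET, where legacy code pages must be registered first.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252).GetBytes(text);
    }

    static void Main()
    {
        string text = "caf\u00E9"; // "café"

        Console.WriteLine(BitConverter.ToString(EncodeAscii(text)));       // 63-61-66-3F
        Console.WriteLine(BitConverter.ToString(EncodeWindows1252(text))); // 63-61-66-E9
    }
}
```

So text pushed through `ASCIIEncoding.GetBytes` silently loses anything outside the 7-bit range, whereas an ANSI code page keeps whatever characters that particular page defines.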

Upvotes: 1
