UnkwnTech

Reputation: 90951

How can I convert unicode characters to ascii codes in delphi 7?

Yes, we're talking about ASCII codes. My apologies, I'm not the Delphi dev here.

Upvotes: 4

Views: 42091

Answers (7)

lkessler

Reputation: 20132

For Delphi 7, I'd get the free Unicode Library by Mike Lischke, who is the author of Virtual Treeview.

The library includes a lot of conversion functions to go to and from Unicode, so you can use the ones that make the most sense in your application.

Or you can upgrade to Delphi 2009, which has built-in encoding routines and its own library of conversion functions.

Upvotes: 6

Eugene Yokota

Reputation: 95684

Let's get a few things straight. A character set (charset) and a character encoding are two related but different concepts. A character set is an abstract list of characters, each associated with an integer character code. A character encoding is essentially an algorithm that describes how those characters are represented in bytes.

ASCII acts as both the character set and the encoding. It uses 7 bits to express 128 characters (94 printable). Unicode, on the other hand, is a character set, expressing 1,114,112 code points. There are several encodings to represent Unicode strings, but the most notable ones are UTF-8, UTF-16, UTF-16LE, and UTF-32. In other words, a single Unicode character can be represented in different ways depending on the encoding.

"How can I convert unicode characters to ascii codes in delphi 7?"

I think the question could be interpreted in two ways.

  1. I have a Unicode string in some encoding that only includes ASCII printable characters. How can I convert the string into a byte array of ASCII encoding?

  2. I have a Unicode string in some encoding that also includes printable non-ASCII characters, such as Chinese characters. How can I encode the string into an ASCII encoding without losing information, and later decode it back to the original Unicode string?

If you mean the first, you can load the Unicode string into a WideString, as Osman suggests, and cast it:

var
  original: WideString;
  s: AnsiString;
begin
  // The cast converts using the system's default ANSI code page.
  s := AnsiString(original);
end;

If you mean the second, you would need a generic encoding algorithm like Base64 encoding. You can use DCPBase64.pas included in David Barton's DCPcrypt v2 Beta 3.
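To make the idea of a lossless, ASCII-only encoding concrete, here is a minimal sketch. It deliberately swaps in a simpler technique than Base64: each UTF-16 code unit is written as four hexadecimal digits. It does not use DCPBase64.pas, and the names WideToHexAscii and HexAsciiToWide are hypothetical helpers made up for this illustration.

```pascal
program HexRoundTrip;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: encode each UTF-16 code unit as four hex digits,
// so the result contains only the ASCII characters 0-9 and A-F.
function WideToHexAscii(const ws: WideString): AnsiString;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(ws) do
    Result := Result + IntToHex(Ord(ws[i]), 4);
end;

// Hypothetical helper: reverse of WideToHexAscii. Every group of four
// hex digits becomes one WideChar again.
function HexAsciiToWide(const s: AnsiString): WideString;
var
  i: Integer;
begin
  SetLength(Result, Length(s) div 4);
  for i := 1 to Length(Result) do
    Result[i] := WideChar(StrToInt('$' + Copy(s, (i - 1) * 4 + 1, 4)));
end;

var
  original, decoded: WideString;
  encoded: AnsiString;
begin
  original := 'A';
  original := original + WideChar($4E2D);  // 'A' followed by a Chinese character
  encoded := WideToHexAscii(original);
  decoded := HexAsciiToWide(encoded);
  Writeln(encoded);             // 00414E2D
  Writeln(decoded = original);  // TRUE
end.
```

Hex escaping quadruples the size of pure-ASCII text, which is why a denser scheme such as Base64 over a byte encoding is usually preferred in practice; the round-trip principle is the same.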

Upvotes: 3

Osman

Reputation:

You can use the function at http://swissdelphicenter.ch/en/showcode.php?id=1692
It converts a Unicode string to an ANSI string using a specified code page.
If you want to convert using the default system code page (defined in Regional Options as the non-Unicode code page), you can simply do the following:

var
  ws: WideString;
  s: string;
begin
  // In Delphi 7, string is AnsiString, so this cast uses the default system code page.
  s := string(ws);
end;
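For the explicit code page case, a conversion along these lines is possible with the Windows API function WideCharToMultiByte. This is only a sketch and not necessarily what the linked function does; WideToAnsiCP is a hypothetical name, and the code page 20127 used in the example is, to my knowledge, the Windows identifier for US-ASCII.

```pascal
program CodePageConvert;
{$APPTYPE CONSOLE}

uses
  Windows;

// Hypothetical helper: convert a WideString to an AnsiString using the
// given Windows code page identifier.
function WideToAnsiCP(const ws: WideString; CodePage: UINT): AnsiString;
var
  len: Integer;
begin
  Result := '';
  if ws = '' then
    Exit;
  // First call: ask how many bytes the converted text needs.
  len := WideCharToMultiByte(CodePage, 0, PWideChar(ws), Length(ws),
    nil, 0, nil, nil);
  if len <= 0 then
    Exit;
  SetLength(Result, len);
  // Second call: convert into the buffer we just allocated.
  WideCharToMultiByte(CodePage, 0, PWideChar(ws), Length(ws),
    PAnsiChar(Result), len, nil, nil);
end;

var
  ws: WideString;
begin
  ws := 'Delphi';
  // 20127 is assumed here to be the US-ASCII code page.
  Writeln(WideToAnsiCP(ws, 20127));
end.
```

Characters that have no equivalent in the target code page are replaced by a default character chosen by Windows, so the conversion is lossy for anything outside that code page.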

Upvotes: 1

Constantin

Reputation: 28204

See related questions on converting from Unicode to ASCII:

In general, a character set with hundreds of thousands of entries cannot be converted to a character set of 128 entries without either losing information or using an encoding scheme.

Upvotes: 1

Steve

Reputation: 6480

As an example, the letter A is represented in Unicode as U+0041 and in ANSI as just 0x41. Converting that is simple, but you must first find out how the Unicode character is encoded. The most common encodings are UTF-16 and UTF-8. UTF-16 is basically two bytes per character, but even that is an oversimplification, since a character outside the Basic Multilingual Plane takes four bytes. UTF-8 sounds as if it means one byte per character, but a character can take two, three, or four bytes. To complicate matters further, UTF-16 can be little endian or big endian, so U+0041 is stored as the bytes 41 00 or 00 41 respectively.

Your question stops making sense if, for example, you want to convert the Arabic letter ain (U+0639) to ANSI on an English locale. You can't.
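The difference in byte counts can be checked with a short sketch using UTF8Encode from the System unit, which is available in Delphi 7. The program name and the two sample characters are chosen purely for illustration.

```pascal
program EncodingSizes;
{$APPTYPE CONSOLE}

var
  ws: WideString;
begin
  // 'A' (U+0041): one byte in UTF-8, one 2-byte code unit in UTF-16.
  ws := 'A';
  Writeln(Length(UTF8Encode(ws)));         // 1
  Writeln(Length(ws) * SizeOf(WideChar));  // 2

  // Arabic letter ain (U+0639): two bytes in UTF-8, still 2 bytes in UTF-16.
  ws := WideChar($0639);
  Writeln(Length(UTF8Encode(ws)));         // 2
  Writeln(Length(ws) * SizeOf(WideChar));  // 2
end.
```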

Upvotes: 1

Rob Kennedy

Reputation: 163357

"ASCII" is the name of a specific mapping of characters to numbers, but some people say "ASCII code" when they don't really mean ASCII at all; they just want the numeric value of a character, whatever mapping is in effect at the time. Does that description apply to you?

If so, then you can use the Ord standard function to get the Unicode code-point value of whatever Unicode character you have.

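var
  wc: WideChar;
  ws: WideString;
  x: Word;
begin
  x := Ord(wc);     // numeric value of a single WideChar
  x := Ord(ws[1]);  // numeric value of the first character of a WideString
end;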

If you really meant ASCII, though, then you'll have to be more specific about what sort of conversion you have in mind.
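If the goal is strict ASCII, one use of Ord is to test whether a string stays inside the ASCII range before converting it. A minimal sketch follows; IsPureAscii is a hypothetical helper name, not a standard routine.

```pascal
program AsciiCheck;
{$APPTYPE CONSOLE}

// Hypothetical helper: True when every character has a code of 127 or
// less, i.e. the string contains nothing but ASCII.
function IsPureAscii(const ws: WideString): Boolean;
var
  i: Integer;
begin
  Result := True;
  for i := 1 to Length(ws) do
    if Ord(ws[i]) > 127 then
    begin
      Result := False;
      Exit;
    end;
end;

var
  ws: WideString;
begin
  ws := 'Delphi';
  Writeln(IsPureAscii(ws));    // TRUE
  ws := ws + WideChar($0639);  // append the Arabic letter ain
  Writeln(IsPureAscii(ws));    // FALSE
end.
```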

Upvotes: 1

Toon Krijthe

Reputation: 53476

It depends on what your definition of conversion is. If you want to map the 128 lowest characters to their Unicode equivalents, you can use an explicit cast. But this creates garbage if the string contains higher characters.

If you want mappings like ë -> e and û -> u, you can write your own code. But be aware that there are always characters that can't be converted.
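A hand-written mapping of this kind could look like the sketch below. FoldToAscii is a hypothetical name, the table covers only a handful of accented e and u letters, and everything it does not know is replaced by '?', which illustrates the point that some characters simply cannot be converted.

```pascal
program FoldAccents;
{$APPTYPE CONSOLE}

// Hypothetical helper: map a few accented letters to their base letter
// and replace every other non-ASCII character with '?'.
function FoldToAscii(const ws: WideString): AnsiString;
var
  i: Integer;
  code: Word;
begin
  SetLength(Result, Length(ws));
  for i := 1 to Length(ws) do
  begin
    code := Ord(ws[i]);
    case code of
      0..127:       Result[i] := Chr(code);  // already ASCII
      $00E8..$00EB: Result[i] := 'e';  // e with grave, acute, circumflex, diaeresis
      $00F9..$00FC: Result[i] := 'u';  // u with grave, acute, circumflex, diaeresis
    else
      Result[i] := '?';                // no ASCII equivalent in this table
    end;
  end;
end;

var
  ws: WideString;
begin
  ws := 't';
  ws := ws + WideChar($00EA) + 'te';  // 't', e with circumflex, 'te'
  Writeln(FoldToAscii(ws));           // tete
end.
```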

Upvotes: 1
