user1897277

Reputation: 495

Handling AnsiString and its Hex code in Unicode aware Delphi versions

I maintain a legacy application, built with Delphi 2007, in which we handle non-English characters by storing the two-digit hex code of each character in the DB. On reading, we apply Chr() to convert these hex codes back to a string.

String to Hex (before saving to DB):

strHex := Format( '%x', [ Byte( strText[ lIndex ] ) shr 4 ] );
DataStr[ lPos ] := strHex[ 1 ];
inc( lPos );

strHex := Format( '%x', [ Byte( strText[ lIndex ] ) and $0F ] );
DataStr[ lPos ] := strHex[ 1 ];
inc( lPos );

// In short, I am saving the hex code to pcData

Hex to String (after reading from DB):

strText := strText + Chr( StrToInt( '$' + DataStr[ lPos ] + DataStr[ lPos + 1 ] ) );

This code started breaking after we moved to Delphi XE7, where string is a Unicode string (UnicodeString) and we have to convert explicitly to the AnsiString type.

Converting the following string to hex:
ТуцЕфылАшдеук8311
Delphi 2007 gives:
\D2\F3\F6\C5\F4\FB\EB\C0\F8\E4\E5\F3\EA8311
Delphi XE7 gives:
\22\43\46\1A\33\4B\4B\48\44\42\14\44\49\33\351522


I would like to know the best way to modify this code so that I can still handle my legacy data.

Upvotes: 2

Views: 1305

Answers (2)

David Heffernan

Reputation: 612964

According to comments, you just need to decode this data to a native Unicode string. Do that like so:

  1. Read the encoded text from the database into a string variable.
  2. Decode that text into a byte array rather than a string. Your Delphi 2007 code can be used pretty much as is, but it needs to write to a byte array rather than a string.
  3. That byte array is encoded in ANSI code page 1251. Decode it by calling GetString on a TEncoding instance for that code page, which you create with Encoding := TEncoding.GetEncoding(1251).
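
The three steps above can be sketched roughly as follows. This is a minimal sketch, not a drop-in replacement: HexToBytes and DecodeCp1251 are hypothetical helper names, and the input is assumed to be a plain run of hex digit pairs. The backslash markers and unencoded digits visible in the question's sample output would need separate handling that is not shown here.

```pascal
program DecodeLegacyHex;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: turns a run of hex digit pairs into raw bytes.
// Assumes Hex contains only hex digits; StrToInt raises on anything else.
function HexToBytes(const Hex: string): TBytes;
var
  I: Integer;
begin
  SetLength(Result, Length(Hex) div 2);
  for I := 0 to High(Result) do
    // Strings are 1-based here; each byte is written as two hex digits.
    Result[I] := Byte(StrToInt('$' + Copy(Hex, 2 * I + 1, 2)));
end;

// Hypothetical helper: interprets the bytes as code page 1251 text.
function DecodeCp1251(const Bytes: TBytes): string;
var
  Encoding: TEncoding;
begin
  Encoding := TEncoding.GetEncoding(1251);
  try
    Result := Encoding.GetString(Bytes);
  finally
    Encoding.Free; // instances returned by GetEncoding belong to the caller
  end;
end;

begin
  // First four encoded bytes from the Delphi 2007 output in the question.
  Writeln(DecodeCp1251(HexToBytes('D2F3F6C5')));
end.
```

Whether the console shows the Cyrillic text correctly depends on its code page and font; inspecting the resulting string in the debugger or a UI control is a more reliable check.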

Upvotes: 1

Remy Lebeau

Reputation: 596256

First, a simpler way to generate the hex string would have been to use the RTL's own BinToHex() function instead of writing your own conversion code, e.g.:

var
  ...
  s: AnsiString;
  DataStr: string; 
  lPos: Integer;
  ...
begin
  ...
  s := '...';
  BinToHex(PAnsiChar(s), @DataStr[lPos], Length(s)); 
  Inc(lPos, Length(s)*2);
  ...
end;

Then, you can use HexToBin() to reverse it. And since you are dealing with encoded ANSI data, you can declare an AnsiString variable that has an affinity for the desired codepage encoding (in your case, probably 1251), read the hex code directly into that variable, and then assign/cast it to a normal String and let the RTL handle the conversion to Unicode for you:

type
  Win1251String = type AnsiString(1251);
var
  ...
  tmp: Win1251String;
  DataStr, strText: string;
  lPos: Integer;
  ...
begin
  ...
  SetLength(tmp, LengthOfHex div 2);
  HexToBin(@DataStr[lPos], PAnsiChar(tmp), Length(tmp));
  strText := String(tmp);
  ...
end;
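
For the opposite direction, writing new data in the legacy format from a Unicode string, the same code-page-typed AnsiString can do the conversion. The following is a minimal sketch under the assumption that the legacy data is code page 1251; StringToCp1251Hex is a hypothetical helper name, and IntToHex is used here in place of BinToHex purely to keep the example short.

```pascal
program EncodeLegacyHex;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  Win1251String = type AnsiString(1251);

// Hypothetical helper: hex-encodes Text as code page 1251 bytes.
function StringToCp1251Hex(const Text: string): string;
var
  Ansi: Win1251String;
  I: Integer;
begin
  // The cast makes the RTL convert from UTF-16 to code page 1251.
  Ansi := Win1251String(Text);
  Result := '';
  for I := 1 to Length(Ansi) do
    // Each AnsiChar is now one byte, so Byte() no longer truncates.
    Result := Result + IntToHex(Byte(Ansi[I]), 2);
end;

begin
  // U+0422 U+0443 U+0446 U+0415 are the first four letters of the
  // question's sample, written as character codes so the example does
  // not depend on the source file's encoding.
  Writeln(StringToCp1251Hex(#$0422#$0443#$0446#$0415));
end.
```

With code page 1251 these four characters should come out as D2F3F6C5, matching the start of the Delphi 2007 output in the question. Characters that have no representation in that code page are lost in the conversion, so this only round-trips text that fits in 1251.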

Upvotes: 2
