Reputation: 361
I have two functions, one to encrypt and the other to decrypt a string, that use the Ord() function. They work great except with extended ASCII codes.
If I use the letter ê (ASCII code 136), the Ord() function returns 234, whereas I expected it to return 136.
If I run decrypt on the encrypted string, I get a different result from the original string: the ê turns into a j.
Can somebody please help me solve this?
procedure TForm1.btnEncryptClick(Sender: TObject);
var
  sTempString: string;
  iIndex,
  i: integer;
begin
  sTempString := edtOriginalString.Text;
  for iIndex := 1 to length(sTempString) do
  begin
    i := ord(sTempString[iIndex]);
    i := i shl 1;
    sTempString[iIndex] := Char(i);
  end;
  edtEncryptedString.Text := sTempString;
end;

procedure TForm1.btnDecryptClick(Sender: TObject);
var
  sTempString: string;
  iIndex,
  i: integer;
begin
  sTempString := edtEncryptedString.Text;
  for iIndex := 1 to length(sTempString) do
  begin
    i := ord(sTempString[iIndex]);
    i := i shr 1;
    sTempString[iIndex] := char(i);
  end;
  edtDecryptedString.Text := sTempString;
end;
Upvotes: 0
Views: 1062
Reputation: 109003
> If I use the letter ê (Ascii code 136)
No, that's actually wrong. ASCII only has 128 characters (0 to 127).
However, ê is the Unicode character U+00EA: LATIN SMALL LETTER E WITH CIRCUMFLEX.
And EA (hex) is indeed 234 (dec).
Before Delphi 2009, Delphi characters and strings are 8-bit; in Delphi 2009 and later they are Unicode (16-bit UTF-16 code units).
So in your case, Delphi 6, a character is 8-bit.
Hence, your left shift pushes the most significant bit (MSB) out of the 8-bit character (it is discarded when the shifted value is converted back with Char), and you cannot possibly hope to get it back.
Indeed, if we take the case of ê (234), we have
1110 1010 (ê)
Shifting the bits one step to the left, we obtain
1101 0100
Shifting the bits one step to the right, we obtain
0110 1010 (j).
Hence, we lost information.
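The same arithmetic can be reproduced in a small console program. This is only a sketch of the bit manipulation, not your form code: it works on a Byte variable and masks with $FF to mimic what happens when the shifted value is stored back into an 8-bit character. The program and variable names are made up for illustration.

```pascal
program ShiftLoss;
{$APPTYPE CONSOLE}

var
  Original, Shifted, Restored: Byte;
begin
  Original := 234;                       // ê as an 8-bit character value
  // 234 shl 1 = 468, which does not fit in 8 bits; keeping only the
  // low 8 bits (as an 8-bit Char does) leaves 212.
  Shifted := (Original shl 1) and $FF;
  // Shifting back cannot restore the discarded top bit: 212 shr 1 = 106 ('j').
  Restored := Shifted shr 1;
  WriteLn(Original, ' -> ', Shifted, ' -> ', Restored);
end.
```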
However, your method will work for ASCII characters (<= 127), because they all have zero as the MSB. It will not work for any characters above 127, because they all have one as the MSB (so it wouldn't work even if Ord did indeed return 136 in your case).
Hence, you need to abandon or redesign your "encryption" method if you want to support characters above 127. For instance, you could rotate the bits instead of shifting them. Or you could invert them (using not).
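As a rough sketch of the inversion idea, assuming 8-bit characters: flipping all eight bits is its own inverse, so the same routine would serve for both directions. InvertBits is a hypothetical helper name, not something from the RTL.

```pascal
program InvertDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: flip all eight bits of a byte.
// The mask keeps the result in 0..255 regardless of how wide
// the intermediate value of "not" is.
function InvertBits(B: Byte): Byte;
begin
  InvertBits := (not B) and $FF;
end;

begin
  WriteLn(InvertBits(234));              // 21
  WriteLn(InvertBits(InvertBits(234)));  // 234 again, so nothing is lost
end.
```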
If you choose to rotate instead of shift, you will get this:
1110 1010 (ê)
rotate left:
1101 0101
rotate back (right):
1110 1010 (ê)
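A minimal sketch of such a rotation, again assuming 8-bit characters as in Delphi 6. RotateLeft8 and RotateRight8 are hypothetical helper names; the bit that falls off one end is OR-ed back in at the other end, so no information is lost.

```pascal
program RotateDemo;
{$APPTYPE CONSOLE}

// Hypothetical helper: rotate an 8-bit value one position to the left.
// The top bit (B shr 7) re-enters as the lowest bit.
function RotateLeft8(B: Byte): Byte;
begin
  RotateLeft8 := ((B shl 1) or (B shr 7)) and $FF;
end;

// Hypothetical helper: rotate an 8-bit value one position to the right.
// The lowest bit (moved up by B shl 7) re-enters as the top bit.
function RotateRight8(B: Byte): Byte;
begin
  RotateRight8 := ((B shr 1) or (B shl 7)) and $FF;
end;

var
  Encrypted: Byte;
begin
  Encrypted := RotateLeft8(234);       // ê
  WriteLn(Encrypted);                  // 213 = 1101 0101
  WriteLn(RotateRight8(Encrypted));    // 234 = 1110 1010, i.e. ê again
end.
```

In the click handlers from the question, the loop body could then become something like sTempString[iIndex] := Char(RotateLeft8(Ord(sTempString[iIndex]))); for encryption, with RotateRight8 for decryption. This only holds while a character is one byte, that is, before Delphi 2009.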
Although it isn't relevant to your actual issue, you might still wonder why Ord doesn't return 136 as you'd expect.
Well, before Unicode (mainly in the 1990s and earlier), there simply were many different, mutually incompatible character encodings. Often, an 8-bit encoding/codepage (characters 0..255) included the ASCII characters (0..127) and then made its own choices for the remaining characters (128..255). Since ê isn't an ASCII character, this means that only some of these "extended ASCII" codepages include ê, and among those that do, the actual numeric values might very well differ.
In other words, your source claiming that ê is 136 and your Delphi program are using different 8-bit codepages. Most likely, your source lists an old DOS codepage such as 437 or 850, where ê is 136, whereas your Delphi program uses the Windows ANSI codepage (Windows-1252 on Western systems), where ê is 234, the same value as in Unicode.
In the modern world of Unicode, this kind of problem no longer exists.
Upvotes: 3