Reputation: 81
I have a piece of Delphi code that encodes an ASN.1 object identifier (OID) into a character string. The string is used later in the program, which eventually converts each character in it to the corresponding hexadecimal value for further processing.
This code was written for Windows. Now I am trying to port it to Linux (CentOS). I am using RAD Studio to compile the code for both Windows and Linux.
In Linux, the same program produces a different output. I thought that the difference in the output is caused by the character set or locale settings used in Windows and Linux. In Windows, I see the character set used is 'Windows-1252' and in my Linux machine the default locale is set to 'en_US.utf8'. So, in the hope of getting the same output in Windows and Linux, I added a new locale using the below commands/steps in Linux:
sudo localedef -v -c -i /usr/share/i18n/locales/en_US -f ./CP1252 en_US.CP1252
After this step, when I did this command:
list-locales | grep -i 1252
I got the below output:
en_US.cp1252
Now, I did the below command:
localectl set-locale LANG="en_US.cp1252"
After this step, when I run the 'locale' command, I see the below output
LANG=en_US.cp1252
LC_CTYPE="en_US.cp1252"
LC_NUMERIC="en_US.cp1252"
LC_TIME="en_US.cp1252"
LC_COLLATE="en_US.cp1252"
LC_MONETARY="en_US.cp1252"
LC_MESSAGES="en_US.cp1252"
LC_PAPER="en_US.cp1252"
LC_NAME="en_US.cp1252"
LC_ADDRESS="en_US.cp1252"
LC_TELEPHONE="en_US.cp1252"
LC_MEASUREMENT="en_US.cp1252"
LC_IDENTIFICATION="en_US.cp1252"
LC_ALL=
I logged out of my session and logged in again, and I see the same output as above for the 'locale' command in the new session as well.
I executed my Test program in Linux, hoping I will get the same output as in Windows, but to my surprise I am getting the same output as before changing the locale.
Can someone please help to point out what is missing here? Is the locale setting not taking effect, or is there anything wrong in what I am doing?
Below is the sample program that I am using, and the corresponding output in Windows and Linux.
program TestProj;
{$APPTYPE CONSOLE}
{$R *.res}
uses
  SysUtils, System.Net.Socket;

function CutString(const Trenner : AnsiString; var s : AnsiString) : AnsiString;
var
  i : integer;
begin
  if s = '' then
    CutString := ''
  else begin
    i := pos(Trenner, s);
    if i = 0 then begin
      CutString := s;
      s := ''
    end
    else begin
      CutString := copy(s, 1, i - 1);
      delete(s, 1, i - 1 + length(Trenner))
    end
  end
end;

function CodeObjectId(oid : AnsiString) : AnsiString;
var
  s : Ansistring;
  i, n : integer;
  testInt : integer;
  testInt1 : integer;
  testChar : AnsiChar;
  testChar1 : AnsiChar;
begin
  i := 0;
  while oid <> '' do begin
    inc(i);
    n := StrToInt(CutString('.', oid));
    if i = 1 then
      s := ansichar(40 * n)
    else if i = 2 then
      s := ansichar(ord(s[1]) + n)
    else begin
      if n > $3fff then
        s := s + ansichar($80 or ((n shr 14) and $7f)) + ansichar($80 or ((n shr 7) and $7f)) + ansichar(n and $7f)
      else if n > $7f then begin
        s := s + ansichar($80 or ((n shr 7) and $7f)) + ansichar(n and $7f);
        testInt := ($80 or ((n shr 7) and $7f));
        testChar := ansichar(testInt);
        testInt1 := (n and $7f);
        testChar1 := ansichar(testInt1);
        writeln('testInt : ', testInt);
        writeln('testChar : ', testChar);
        writeln('testInt1 : ', testInt1);
        writeln('testChar1 : ', testChar1);
      end
      else
        s := s + ansichar(n and $7f)
    end
  end;
  CodeObjectId := s
end;

var
  formatSettings : TFormatSettings;
  SysLocale : TSysLocale;
  pLocale : PAnsiChar;
  Locale, LocaleType: Integer;
  DefaultLocale: string;
begin
  try
    writeln('-----------------------------------------------');
    CodeObjectId('1.3.12.2.1107.3.66.3.1');
    writeln('-----------------------------------------------');
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
end.
Output In Windows:
testInt : 136
testChar : ^
testInt1 : 83
testChar1 : S
Output in Linux:
testInt : 136
testChar : ▒
testInt1 : 83
testChar1 : S
As I mentioned above, the encoded string is used later in the program to get the corresponding hexadecimal values for further processing. This hexadecimal value (the final output) is different as the encoded characters are different in Windows and Linux.
Upvotes: 1
Views: 398
Reputation: 9855
The algorithm appears to be the ASN.1 encoding of an object identifier (OID). Its result is a sequence of binary data bytes that does not necessarily form a valid string.
If your application requires this binary data to be interpreted as a string, then that is an error.
See https://stackoverflow.com/a/5929189/10622916 or https://learn.microsoft.com/en-us/windows/win32/seccertenroll/about-object-identifier?redirectedfrom=MSDN
As you can see from the numeric values of testInt and testInt1, the binary result of the calculation is the same (at least for the two bytes shown in the question). Only the interpretation of the binary data as a string or as characters in the system's encoding seems to be different.
In my opinion the data type AnsiString for the result of CodeObjectId is wrong (or at least misleading). It should be a dynamic Array of Byte instead. If you cannot change the data type you should at least change the interpretation of the data.
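As an illustration of the byte-array suggestion, here is a minimal sketch of the encoder returning TBytes instead of AnsiString. The names CodeObjectIdBytes and AppendBase128 are hypothetical and not part of the original program. Unlike the question's code, this sketch emits as many 7-bit groups as a value needs rather than stopping at three, and it assumes the input is a well-formed dotted OID with at least two arcs.

```pascal
program OidBytesSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical helper: appends n to buf as base-128 groups, most
// significant group first, with bit 7 set on every byte except the last.
procedure AppendBase128(var buf : TBytes; n : Cardinal);
var
  tmp : array[0..4] of Byte;   // a 32-bit value needs at most 5 groups
  count, k : Integer;
begin
  count := 0;
  repeat
    tmp[count] := n and $7f;   // collect groups, least significant first
    n := n shr 7;
    Inc(count);
  until n = 0;
  for k := count - 1 downto 0 do begin
    SetLength(buf, Length(buf) + 1);
    if k > 0 then
      buf[High(buf)] := tmp[k] or $80   // continuation byte
    else
      buf[High(buf)] := tmp[k];         // final byte of this value
  end;
end;

// Hypothetical replacement for CodeObjectId that returns raw bytes.
function CodeObjectIdBytes(const oid : string) : TBytes;
var
  rest, part : string;
  p, idx : Integer;
  n, first : Cardinal;
begin
  SetLength(Result, 0);
  rest := oid;
  idx := 0;
  first := 0;
  while rest <> '' do begin
    // Cut the next arc off the front of the dotted string.
    p := Pos('.', rest);
    if p = 0 then begin
      part := rest;
      rest := '';
    end
    else begin
      part := Copy(rest, 1, p - 1);
      Delete(rest, 1, p);
    end;
    n := Cardinal(StrToInt(part));
    Inc(idx);
    if idx = 1 then
      first := 40 * n                     // combined with the second arc
    else if idx = 2 then
      AppendBase128(Result, first + n)
    else
      AppendBase128(Result, n);
  end;
end;

var
  b : TBytes;
  i : Integer;
begin
  b := CodeObjectIdBytes('1.3.12.2.1107.3.66.3.1');
  // Print the bytes as hex so the output does not depend on any locale.
  for i := 0 to High(b) do
    Write(IntToHex(b[i], 2), ' ');
  Writeln;
end.
```

Because the result never passes through a character type, the console code page and the locale play no part in what is stored or printed.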
If you only want to compare the result of function CodeObjectId on different systems, I suggest printing the character codes as hexadecimal bytes instead of printing the result as a string, e.g.
s := CodeObjectId('1.3.12.2.1107.3.66.3.1');
// writeln(s); // wrong interpretation as a string
for i := 1 to Length(s) do
  write(IntToHex(Ord(s[i]),2), ' ');
writeln;
This prints
2B 0C 02 88 53 03 42 03 01
where the bytes correspond to the input as
2B = 1.3 -> 1 * 40 + 3 = 43
0C = 12
02 = 2
88 53 = 1107
03 = 3
42 = 66
03 = 3
01 = 1
see https://onlinegdb.com/obsAGIGmJ for the full program
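The byte-to-arc mapping above can also be checked in the reverse direction. The following is a minimal sketch of a decoder; DecodeObjectId is a hypothetical name that does not appear in the original program, and the sketch assumes the input holds only the content bytes of a well-formed OID (no tag or length prefix).

```pascal
program OidDecodeSketch;
{$APPTYPE CONSOLE}
uses
  SysUtils;

// Hypothetical helper: turns the content bytes of an encoded OID back
// into dotted notation.
function DecodeObjectId(const data : array of Byte) : string;
var
  i : Integer;
  value : Cardinal;
  isFirst : Boolean;
begin
  Result := '';
  value := 0;
  isFirst := True;
  for i := 0 to High(data) do begin
    // Each byte contributes its low 7 bits, most significant group first.
    value := (value shl 7) or (data[i] and $7f);
    // A clear high bit marks the last byte of the current value.
    if (data[i] and $80) = 0 then begin
      if isFirst then begin
        // The first value packs the first two arcs as 40 * X + Y.
        if value < 40 then
          Result := '0.' + IntToStr(value)
        else if value < 80 then
          Result := '1.' + IntToStr(value - 40)
        else
          Result := '2.' + IntToStr(value - 80);
        isFirst := False;
      end
      else
        Result := Result + '.' + IntToStr(value);
      value := 0;
    end;
  end;
end;

begin
  // The nine bytes listed above should decode to 1.3.12.2.1107.3.66.3.1
  Writeln(DecodeObjectId([$2B, $0C, $02, $88, $53, $03, $42, $03, $01]));
end.
```

Running the decoder on the output from both Windows and Linux is one way to confirm that the encoded bytes are identical and that only the character rendering differs.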
Upvotes: 4