Pekkala

Reputation: 21

Delphi XE7 - Convert memo to UTF-8

How can I convert my memo.text to UTF-8 and send it by e-mail via the TIdMessage component? I used this function, but it does not work properly:

function TForm1.EncodeAsUTF8(UnicodeStr: string): AnsiString;
var
  UTF8Str: UTF8String;
  i: Integer;
begin

  UTF8Str := UTF8String(UnicodeStr);

  SetLength(Result, Length(UTF8Str));

  for i := 1 to Length(UTF8Str) do
    Result[i] := AnsiChar(Ord(UTF8Str[i])); 

end;

Upvotes: 1

Views: 2224

Answers (1)

Remy Lebeau

Reputation: 597941

Your function is not letting the RTL know that the AnsiString is UTF-8 encoded. Thus data loss may occur when the AnsiString is assigned to other strings after the function exits.
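
To see the mismatch for yourself, here is a minimal sketch, assuming a Windows console program built with a Unicode version of Delphi (2009 or later). The program name and the variables Source, Utf8 and Copied are made up for illustration. It repeats the byte-by-byte copy from the question and prints the code page recorded in each string's header via StringCodePage():

```delphi
program CodePageDemo; // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

var
  Source: string;
  Utf8: UTF8String;
  Copied: AnsiString;
  I: Integer;
begin
  Source := #$00E9;            // U+00E9, which takes two bytes in UTF-8
  Utf8 := UTF8String(Source);  // the RTL converts UTF-16 to UTF-8 here

  // Same byte-by-byte copy as in the question: the bytes are UTF-8,
  // but the AnsiString header is never told so.
  SetLength(Copied, Length(Utf8));
  for I := 1 to Length(Utf8) do
    Copied[I] := Utf8[I];

  Writeln('UTF8String code page:  ', StringCodePage(Utf8));
  Writeln('AnsiString code page:  ', StringCodePage(Copied));
  Writeln('System ANSI code page: ', DefaultSystemCodePage);
end.
```

The UTF8String should report CP_UTF8 (65001), while the copied AnsiString will typically report the system ANSI code page instead. Any later conversion of that AnsiString to a Unicode string would then decode the UTF-8 bytes as ANSI, which is where the data loss comes from.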

If you absolutely need to return a UTF-8 encoded AnsiString (which I do not recommend), then you have to ensure its metadata states the characters are using UTF-8, eg:

function TForm1.EncodeAsUTF8(UnicodeStr: string): AnsiString;
var
  UTF8Str: UTF8String;
begin
  UTF8Str := UTF8String(UnicodeStr);
  SetString(Result, PAnsiChar(UTF8Str), Length(UTF8Str));
  SetCodePage(PRawByteString(@Result)^, CP_UTF8, False);
end;

Alternatively:

function TForm1.EncodeAsUTF8(UnicodeStr: string): AnsiString;
begin
  PUTF8String(@Result)^ := UnicodeStr;
end;

However, it would be much simpler to just return a UTF8String instead and let the RTL handle the UTF-8 for you, eg:

function TForm1.EncodeAsUTF8(UnicodeStr: string): UTF8String;
begin
  Result := UnicodeStr;
end;

Or, at least return a UTF-8 encoded RawByteString instead, eg:

function TForm1.EncodeAsUTF8(UnicodeStr: string): RawByteString;
begin
  Result := UTF8String(UnicodeStr);
end;

UPDATE: That being said, TIdMessage is an Indy component, and Indy operates on normal String values. In Unicode versions of Delphi (and FPC), Indy will handle the UTF-8 encoding for you when preparing the email for transmission. Simply set the TIdMessage.Body to hold your Memo's normal Unicode text, and set the TIdMessage.CharSet to 'utf-8', eg:

MailMessage.Body := Memo.Lines;
// or: MailMessage.Body.Text := Memo.Text;
MailMessage.CharSet := 'utf-8';

That is all you need. You don't have to encode the Memo text to UTF-8 manually at all.

Only in non-Unicode versions of Delphi (and FPC) would it make sense to use your EncodeAsUTF8() function. The TIdMessage.CharSet property would still need to be set so the email headers claim UTF-8, but Indy will send the AnsiString bytes as-is and not re-encode them, so you would be responsible for ensuring the AnsiString is using UTF-8, eg:

function TForm1.EncodeAsUTF8(UnicodeStr: string): AnsiString;
begin
  Result := UTF8Encode(UnicodeStr);
end;

...

MailMessage.Body.Text := EncodeAsUTF8(Memo.Text);
MailMessage.CharSet := 'utf-8';

Upvotes: 3
