John Thomas

Reputation: 4135

How to send an email containing Greek characters using TIdMessage and Delphi XE *UPDATED*

We want to send the following htm file as the body of an email, using Delphi XE and Indy's TIdMessage component:

<html>

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1253">
<meta name=Generator content="Microsoft Word 12 (filtered)">
<style>
<!--
 /* Font Definitions */
 @font-face
    {font-family:"Cambria Math";
    panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
    {font-family:Tahoma;
    panose-1:2 11 6 4 3 5 4 4 2 4;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
    {margin:0cm;
    margin-bottom:.0001pt;
    font-size:12.0pt;
    font-family:"Times New Roman","serif";
    color:black;}
.MsoChpDefault
    {font-size:10.0pt;}
@page Section1
    {size:595.3pt 841.9pt;
    margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.Section1
    {page:Section1;}
-->
</style>

</head>

<body bgcolor=white lang=EL>

<div class=Section1>

<p class=MsoNormal><span lang=EN-US style='font-family:"Tahoma","sans-serif"'>Abcd</span><span
lang=EN-US style='font-family:"Tahoma","sans-serif"'> </span><span
style='font-family:"Tahoma","sans-serif"'>αβγδ ά&#8118;&#8048;&#7938; </span></p>

</div>

</body>

</html>

(Ok, the actual file is different but the problem is the same).

If you save the above file as temp.htm and open it in Internet Explorer, you'll see 4 Latin characters, 4 Greek characters without tone marks and 4 Greek characters with tone marks (variations of alpha, the first letter of the Greek alphabet). Something like this:

Abcd αβγδ άᾶὰἂ

So far, so good.

If we load the above file into the Body property of the TIdMessage and send it by email, it is displayed like this:

Abcd ???? ?ᾶὰἂ

As you can see, the Greek letters from the monotonic alphabet are replaced with ???? ? (tested using Mozilla Thunderbird 3 on WinXP).

The properties of the TIdMessage component are as follows:

TIdMessage Properties

I tried setting the CharSet to Windows-1253, but with no luck.

Any ideas how this can work?

UPDATE:

Answering your questions:

The raw source of the message as received is shown below (the email addresses were redacted):

From - Thu Sep 15 11:11:06 2011
X-Account-Key: account3
X-UIDL: 00007715
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00400000
X-Mozilla-Keys:                                                                                 
Return-Path: [redacted]
X-Envelope-To: [redacted]
X-Spam-Status: No, hits=0.0 required=5.0
    tests=AWL: 0.194,BAYES_20: -0.73,HTML_MESSAGE: 0.001,
    MIME_HEADER_CTYPE_ONLY: 0.56,MIME_HTML_ONLY: 0.001,MISSING_MID: 0.001,
    CUSTOM_RULE_FROM: ALLOW,TOTAL_SCORE: 0.027,autolearn=no
X-Spam-Level: 
Received: from localhost ([127.0.0.1])
    by [redacted]
    for [redacted];
    Thu, 15 Sep 2011 11:10:59 +0300
From: [redacted]
Subject: Test msg
To: [redacted]
Content-Type: text/html; charset=us-ascii
Sender: [redacted]
Reply-To: [redacted]
Disposition-Notification-To: [redacted]
Return-Receipt-To: [redacted]
Date: Thu, 15 Sep 2011 11:10:59 +0300

<html>

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1253">
<meta name=Generator content="Microsoft Word 12 (filtered)">
<style>
<!--
 /* Font Definitions */
 @font-face
    {font-family:"Cambria Math";
    panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
    {font-family:Tahoma;
    panose-1:2 11 6 4 3 5 4 4 2 4;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
    {margin:0cm;
    margin-bottom:.0001pt;
    font-size:12.0pt;
    font-family:"Times New Roman","serif";
    color:black;}
.MsoChpDefault
    {font-size:10.0pt;}
@page Section1
    {size:595.3pt 841.9pt;
    margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.Section1
    {page:Section1;}
-->
</style>

</head>

<body bgcolor=white lang=EL>

<div class=Section1>

<p class=MsoNormal><span lang=EN-US style='font-family:"Tahoma","sans-serif"'>Abcd</span><span
lang=EN-US style='font-family:"Tahoma","sans-serif"'> </span><span
style='font-family:"Tahoma","sans-serif"'>???? ?&#8118;&#8048;&#7938; </span></p>

</div>

</body>

</html>

Mozilla Thunderbird also reports Message Encoding: Western (ISO-8859-1). I tried setting different encodings on the IdMessage component, such as windows-1253 (Greek) and UTF-8; the result was the same. I also converted the htm file to UTF-8 using Notepad++ and changed the charset in the HTML's meta info by hand; it looked the same. I then sent the message again. The result: Abcd ???2?3?? ??ᾶὰἂ

Upvotes: 4

Views: 4614

Answers (3)

I use Indy 10 and Delphi XE2 (where the standard string type is Unicode). I set the message's CharSet to 'ISO-8859-7' and add text to the body using UTF8Encode:

TempMess := TIdMessage.Create(Self);
TempMess.CharSet := 'ISO-8859-7';
TempMess.Body.Add(UTF8Encode('Καλημέρα!!!'));

Upvotes: 0

Remy Lebeau

Reputation: 595827

If you look at your own screenshots, you will see that TIdMessage and the transmitted email are both set to use US-ASCII as the CharSet. That is why your data is getting altered.
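As a rough illustration of why a US-ASCII CharSet loses the Greek text, the sketch below encodes four Greek letters with the RTL's 7-bit ASCII encoding and prints the resulting bytes. It is a standalone console program, not Indy's own code path: the program name is hypothetical, and TEncoding.ASCII merely stands in for whatever encoder Indy applies when CharSet is us-ascii.

```delphi
program AsciiLossDemo;  // hypothetical name, for illustration only

{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  Bytes: TBytes;
  I: Integer;
begin
  // Four Greek letters (alpha, beta, gamma, delta), written as code points
  // so that the encoding of this source file does not matter.
  Bytes := TEncoding.ASCII.GetBytes(#$03B1#$03B2#$03B3#$03B4);

  // Print each encoded byte in hex. ASCII has no Greek letters, so every
  // byte is a 7-bit substitute and the original text cannot be recovered.
  for I := 0 to High(Bytes) do
    Write(IntToHex(Bytes[I], 2), ' ');
  Writeln;
end.
```

The polytonic letters in the question survive only because the HTML file stores them as numeric entities such as &#8118;, which are plain ASCII text.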

If you load the HTML into the TIdMessage.Body or TIdText.Body property, you have to decode the data to UTF-16 (since that is what the Body property uses in XE) and then set the TIdMessage.CharSet or TIdText.CharSet property to windows-1253 so that the UTF-16 data is re-encoded properly when the email is sent, e.g.:

Enc := CharsetToEncoding('windows-1253');
try
  IdMessage.Body.LoadFromFile('file.htm', Enc);
  IdMessage.ContentType := 'text/html';
  IdMessage.CharSet := 'windows-1253';
finally
  Enc.Free;
end;

Or:

Enc := CharsetToEncoding('windows-1253');
try
  with TIdText.Create(IdMessage.MessageParts, nil) do
  begin
    Body.LoadFromFile('file.htm', Enc);
    ContentType := 'text/html';
    CharSet := 'windows-1253';
  end;
finally
  Enc.Free;
end;

If you load the HTML into a TIdAttachment object instead, then you don't have to decode/encode anything manually, since the attachment data is sent as-is:

with TIdAttachmentFile.Create(IdMessage.MessageParts, 'file.htm') do
begin
  ContentType := 'text/html';
end;

Upvotes: 3

Mad Hatter

Reputation: 772

Try setting ContentTransferEncoding, for example to quoted-printable. Remember that mail still uses 7-bit characters (unless a server advertises that it can handle 8-bit or binary data), so a proper transfer encoding is needed.
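A minimal sketch of that suggestion, assuming an existing TIdMessage whose Body already holds the HTML as correctly decoded Unicode text. The helper name ConfigureGreekHtml is hypothetical, and the windows-1253 charset is taken from the question rather than required by the transfer encoding; the fragment is meant for a unit's implementation section.

```delphi
uses
  IdMessage;

// Hypothetical helper, for illustration only. AMsg is assumed to already
// contain the HTML text in its Body property.
procedure ConfigureGreekHtml(AMsg: TIdMessage);
begin
  AMsg.ContentType := 'text/html';
  // CharSet is assigned after ContentType, the same order the other
  // answer in this thread uses.
  AMsg.CharSet := 'windows-1253';
  // Request quoted-printable so that non-ASCII bytes are escaped and
  // only 7-bit data goes over the wire.
  AMsg.ContentTransferEncoding := 'quoted-printable';
end;
```

The transfer encoding only protects bytes in transit. If the CharSet is still us-ascii, the Greek letters are already lost before the transfer encoding is applied, so both settings are needed.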

Upvotes: 0
