Reputation: 1047
Since it's possible to have umlauts (e.g. ö, ä, ü) in the local part of an email address, I need to convert them to ASCII, because Zend-Mail cannot handle them: it always throws an invalid header exception.
There is the PHP function idn_to_ascii, which converts domain names to IDNA ASCII format. The problem is that I'm not sure how to use it correctly.
Let's take this email address: testö@domain.com
// doesn't work (unknown error):
idn_to_ascii('testö@domain.com') --> [email protected]
If I just convert the local part of the email address it seems to work:
idn_to_ascii('testö') --> [email protected]
But what if the domain part also contains umlauts?
E.g. testö@domainö.com
Should I do something like this?
idn_to_ascii('testö').'@'.idn_to_ascii('domainö.com')
Also, on the PHP website someone commented that you have to skip the top-level domain part (and that the official documentation is wrong): see here
idn_to_ascii('domainö') // right
idn_to_ascii('domainö.com') // wrong
I'm so confused now :|
Does anyone have experience with this? The worst part is that I can't even test it, because I don't have an email address with umlauts.
Upvotes: 5
Views: 1653
Reputation: 98921
As of 06 December 2022, testö@domain.com is not a valid email address because the local part (testö) can only contain the following ASCII characters:
References:
The exact rule is that any ASCII character, including control
characters, may appear quoted, or in a quoted string. When quoting
is needed, the backslash character is used to quote the following
character. For example
Abc\@def@example.com
is a valid form of an email address. Blank spaces may also appear, as in
Fred\ Bloggs@example.com
The backslash character may also be used to quote itself, e.g.,
Joe.\\Blow@example.com
In addition to quoting using the backslash character, conventional double-quote characters may be used to surround strings. For example
"Abc@def"@example.com "Fred Bloggs"@example.com
are alternate forms of the first two examples above. These quoted forms are rarely recommended, and are uncommon in practice, but, as
discussed above, must be supported by applications that are
processing email addresses. In particular, the quoted forms often
appear in the context of addresses associated with transitions from
other systems and contexts; those transitional requirements do still
arise and, since a system that accepts a user-provided email address
cannot "know" whether that address is associated with a legacy
system, the address forms must be accepted and passed into the email
environment.
Without quotes, local-parts may consist of any combination of
alphabetic characters, digits, or any of the special characters
! # $ % & ' * + - / = ? ^ _ ` . { | } ~
period (".") may also appear, but may not be used to start or end the local part, nor may two or more consecutive periods appear. Stated differently, any ASCII graphic (printing) character other than the at-sign ("@"), backslash, double quote, comma, or square brackets may appear without quoting. If any of that list of excluded characters are to appear, they must be quoted. Forms such as
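user+mailbox@example.com customer/department=shipping@example.com $A12345@example.com !def!xyz%abc@example.com _somename@example.com
are valid and are seen fairly regularly, but any of the characters listed above are permitted.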
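The unquoted rules quoted above can be sketched as a small check. This is only a rough illustration, not a full address parser: the function name isValidUnquotedLocalPart is made up for this example, quoted strings and backslash escapes are deliberately not handled, and length limits are not checked.

```php
<?php
// Hypothetical helper: checks only the *unquoted* local-part rules quoted above.
// Quoted strings and backslash escapes are deliberately not handled.
function isValidUnquotedLocalPart(string $localPart): bool
{
    // One run of letters, digits, or the listed special characters.
    $atom = '[A-Za-z0-9!#$%&\'*+\-\/=?^_`{|}~]+';

    // Periods may only sit between two runs, so they cannot start or end
    // the local part and cannot appear twice in a row.
    $pattern = '/^' . $atom . '(?:\.' . $atom . ')*$/D';

    return preg_match($pattern, $localPart) === 1;
}

var_dump(isValidUnquotedLocalPart('user+mailbox'));
var_dump(isValidUnquotedLocalPart('testö'));
```

Under these rules a local part such as testö is rejected, because the bytes of ö fall outside the allowed ASCII set.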
Upvotes: 2
Reputation: 91
Something a bit simpler:
function email_to_ascii($email) {
    // Only the domain part is converted; the local part is passed through unchanged.
    $explode = explode('@', $email);
    return $explode[0].'@'.idn_to_ascii($explode[1]);
}
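A slightly more defensive variant of the same idea, as a sketch rather than a definitive implementation. It assumes the intl extension is loaded (it provides idn_to_ascii), splits at the last @ so that a quoted local part containing @ stays intact, and returns false when the input has no usable domain or the conversion fails. The name emailDomainToAscii is made up for illustration.

```php
<?php
// Hypothetical helper; requires the intl extension for idn_to_ascii().
// Returns the address with an ASCII domain, or false on failure.
function emailDomainToAscii(string $email)
{
    // Split at the last '@' so a quoted local part such as "Abc@def" survives.
    $at = strrpos($email, '@');
    if ($at === false || $at === strlen($email) - 1) {
        return false; // no '@' at all, or nothing after it
    }
    $local  = substr($email, 0, $at);
    $domain = substr($email, $at + 1);

    // Convert the whole domain in one call; the function works label by label,
    // and pure-ASCII labels such as "com" come back unchanged.
    $asciiDomain = idn_to_ascii($domain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    if ($asciiDomain === false) {
        return false;
    }

    // The local part is passed through untouched: IDNA only covers domain names.
    return $local . '@' . $asciiDomain;
}
```

Note that the local part is returned as-is, so an address like testö@domainö.com still reaches the mail library with a non-ASCII local part; only the domain is made safe.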
Upvotes: 1
Reputation: 1
Try something like this:
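function emailToAscii($email) {
    // array_first() and last() are assumed to be Laravel's helper functions here.
    $explodedMail = explode('@', $email);
    // Caveat: IDNA is defined for domain names, so encoding the local part changes the mailbox name.
    $mailName = idn_to_ascii(array_first($explodedMail));
    $mailDomain = last($explodedMail);
    $explodedDomain = explode('.', $mailDomain);
    // Only the first and last labels are used, so this assumes a two-label domain such as domain.com.
    $secondLvlDomain = idn_to_ascii(array_first($explodedDomain));
    $firstLvlDomain = idn_to_ascii(last($explodedDomain));
    return "$mailName@$secondLvlDomain.$firstLvlDomain";
}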
Upvotes: 0