Reputation:
So I modified some emails I send to get rid of images and replace them by special unicode characters. For example I had an arrow image and replaced it with ↗
while wrapping it in a <span>
to give it the color I want.
When I look at the source in Gmail (3 dots > Show Original) I see this:
...
--1234567890123456789012345678
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.=
w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns=3D"http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv=3D"Content-Type" content=3D"text/html; charset=3DUTF-8" />
</head>
<body>
...
... <span style=3D"font-family:arial,verdana;font-weight:bold;color:#209a20">↗</span> ...
...
</body>
</html>
--1234567890123456789012345678--
Which is what I'd expect since that's what I wrote in my code.
Now the problem is that it displays like this in the Gmail web interface:
What am I doing wrong? Isn't UTF-8 a unicode encoding that should support this character?
I would understand if some of these special characters are displayed as square boxes or something, but I do not understand how they can remain encoded while the
turns into a space correctly.
It also makes me question whether other email clients will display these correctly (would love feedback on that too).
Upvotes: 1
Views: 3357
Reputation: 5259
Try the HTML code rather than the HTML entity.
So ↗
for the north east arrow, as per
https://www.toptal.com/designers/htmlarrows/arrows/north-east-arrow/
Best reference for this is usually https://unicode-table.com/en/
Upvotes: 0
Reputation: 142433
In the 1950's computers could handle only capital letters, digits and some punctuation.
Before 1970, EBCDIC was invented (only to later die out) for handling lower case and a few more punctuation characters.
Then came a plethora of encodings to handle European accents, Cyrillic, Greek, and eventually Chinese. (There are some interesting stories on the invention of typewriters for handling Chinese!)
Eventually, the Unicode group got together and slowly created a universal standard. It has been evolving for a few decades and continues to enhance it -- emojis are a big addition that is ongoing.
But, meanwhile, how does one put Emoji, etc, in URLs, type them on a keyboard, etc, etc? Those standards are lagging way behind. So, there are kludges in place.
↗
for that arrow.%E2%86%97
.\U8599
which is based on the decimal value of the "codepoint". (I think Java goes that direction.)UNHEX('E28697')
I don't know of anything other than HTML that reacts favorably to ↗
Ever notice a +
in a URL? That is the encoding for a single space. (Also %20
works there.)
Upvotes: 0