clstaudt

Reputation: 22438

How to handle special characters in markdown?

I am just discovering Markdown and MultiMarkdown and I am loving it so far. However, special characters are not properly escaped when exporting to HTML and come out as garbage in the browser.

Example:

How does Markdown handle special characters?
============================================

For example, German is full of ä, ö, ü and ß.

is converted to

<h1 id="howdoesmarkdownhandlespecialcharacters">How does Markdown handle special characters?</h1>

<p>For example, German is full of ä, ö, ü and ß.</p>

Since I have to write in German a lot, entering the escape sequences by hand is not an option. How can I get HTML output with properly escaped special characters?

Upvotes: 19

Views: 35772

Answers (3)

Gael Lorieul

Reputation: 3246

What you want is to tell the browser to use UTF-8 encoding, in which case those "special" characters will be displayed correctly without any escaping. The encoding is declared by adding the <meta charset="UTF-8"> tag to the page's <head> section; the file itself must of course also be saved as UTF-8.

<!DOCTYPE html>
<html>

<head>
<meta charset="UTF-8">
<title>Title of the document</title>
</head>

<body>
<h1 id="howdoesmarkdownhandlespecialcharacters">How does Markdown handle special characters?</h1>
<p>For example, German is full of ä, ö, ü and ß.</p>
</body>
</html>

The charset information can get there in one of three ways:

  • The Markdown-to-HTML converter may add it directly: for instance, that's what pandoc does when one invokes
    pandoc -o index.html index.md --standalone
    where index.md contains the OP's original Markdown,
  • Or you can enter the <meta charset="UTF-8"> tag manually after the *.html file has been generated,
  • Or the Markdown-to-HTML converter you use might provide an option for injecting content (in our case the <meta> tag) into the <head> section. In the case of pandoc that would be the option -H, aka --include-in-header, although that is unnecessary because pandoc's standalone output already declares the UTF-8 charset.
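If the converter only emits an HTML fragment, the manual step in the second bullet can be scripted. The following is a minimal sketch in Python, not part of any converter: the names PAGE_TEMPLATE, wrap_fragment and write_page are made up for illustration, and the template simply mirrors the page shown above.

```python
import html
from pathlib import Path

# Hypothetical page template; the <meta charset> line is the part that matters.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>{title}</title>
</head>
<body>
{body}
</body>
</html>
"""


def wrap_fragment(fragment: str, title: str = "Untitled") -> str:
    """Wrap a converter-generated HTML fragment in a page that declares UTF-8."""
    return PAGE_TEMPLATE.format(title=html.escape(title), body=fragment.strip())


def write_page(path: str, fragment: str, title: str = "Untitled") -> None:
    """Write the page as UTF-8 bytes so the file matches the declared charset."""
    Path(path).write_text(wrap_fragment(fragment, title), encoding="utf-8")
```

Calling write_page("index.html", fragment) with the converter's output as fragment would produce a file whose bytes and declared charset agree, which is what the browser needs.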

Upvotes: 7

JustAnotherCoder

Reputation: 153

I do not know if this scenario applies to you, but here goes:

I have the same need regarding the Norwegian letters 'æ', 'ø' and 'å'. I use Firefox and the add-on 'Markdown Viewer' to view Markdown documents.

Viewing a Norwegian document in Markdown Viewer will render garbled letters if the document is saved in the ordinary manner.

Saving the document using Western (Windows-1252) encoding renders the text just fine (I also tried with your German letters).
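One plausible explanation, which is an assumption and not something verified for the add-on, is that the viewer decodes the file as Windows-1252 regardless of how it was saved. The short Python sketch below reproduces that kind of mismatch; the helper name misread is hypothetical.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one encoding, then decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


# A file saved as UTF-8 but read as Windows-1252: each letter becomes two characters.
garbled = misread("ä", "utf-8", "cp1252")

# A file saved and read with the same encoding survives unchanged.
intact = misread("ä", "cp1252", "cp1252")
```

Here garbled is the two-character string "Ã¤", the typical garbage seen in the browser, while intact is still "ä".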

Upvotes: 6

SphinxMan

Reputation: 3744

As far as I know, this is not possible (though I would be happy to be proven wrong). I have recently been generating documentation in Doxygen using Markdown syntax and have had to replace all ° symbols with &deg;, which is a shame as it goes against the philosophy of Markdown, which is to make the text files as readable as the generated output.
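Where entities really are required, the replacement does not have to be done by hand in the source. A minimal sketch in Python, assuming a post-processing step over the generated HTML is acceptable (whether this fits a Doxygen workflow is not something checked here); the function name to_entities is hypothetical.

```python
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with an HTML character reference."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # ASCII, including the markup itself, is left alone
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity, e.g. &deg;
        else:
            out.append(f"&#{cp};")  # numeric reference as a fallback
    return "".join(out)
```

Applied to the example paragraph, to_entities("ä, ö, ü and ß") gives "&auml;, &ouml;, &uuml; and &szlig;", so the Markdown source can keep the readable characters.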

Upvotes: 6
