Andre Pena

Reputation: 59396

MarkdownSharp/Markdown.NET: How to retrieve non-formatted text from markdown?

Sometimes it is useful to convert Markdown to plain text (for sending in an e-mail, for instance).

Does either of these libraries support this? (I'm actually more interested in MarkdownSharp.)

EDIT

Responding to Jorn's comment. I'll clarify what I expect from this kind of conversion:
Markdown has special characters that, depending on context, carry only formatting meaning: the **, = and - characters, for instance. It would be nice if I could strip those formatting characters from the text.

I'm not sure what the best approach would be or which characters should be removed, nor do I know what to do with links, for instance, but I think someone might have done something like this before.

EDIT 2

Found a good example: Stack Overflow does this kind of Markdown clearing in the "Questions" list. I'm quite sure it removes the Markdown formatting before rendering the question excerpt; otherwise the excerpt would contain line breaks, strongs, H1s and so forth.

EDIT 3

I agree with John. The best solution seems to be to convert the Markdown to HTML and then strip the tags from the resulting HTML.

And this task seems to be already solved: How Can I strip HTML from Text in .NET?

Upvotes: 4

Views: 1741

Answers (1)

John Feminella

Reputation: 311645

If you just want to retain the original text, then simply don't pass it to Markdown.

Markdown is for one thing only: turning Markdown-formatted text into HTML. If you want Markdown to format it in something other than HTML with a different set of transformation rules, then alas, you'll have to write your own transformer.

If you want a "text-only" version of Markdown that has already been converted to HTML, you can just strip the HTML tags. This is what Stack Overflow does.
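
A minimal sketch of that convert-then-strip approach in C#, assuming the MarkdownSharp NuGet package is referenced. The `MarkdownText` class and `ToPlainText` method are names made up for this illustration, and the regex is a deliberately naive tag matcher rather than a real HTML parser, so treat it as a starting point only:

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;
using MarkdownSharp; // assumes the MarkdownSharp NuGet package

// Hypothetical helper, not part of MarkdownSharp.
public static class MarkdownText
{
    // Naive tag matcher: adequate for the HTML MarkdownSharp emits,
    // but not a general-purpose HTML parser.
    private static readonly Regex Tag = new Regex("<[^>]+>", RegexOptions.Compiled);

    public static string ToPlainText(string markdown)
    {
        // Step 1: Markdown -> HTML.
        string html = new Markdown().Transform(markdown);

        // Step 2: drop the tags, keeping the text between them.
        string text = Tag.Replace(html, string.Empty);

        // Step 3: turn entities such as &amp; back into characters.
        return WebUtility.HtmlDecode(text).Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        string input = "Some **bold** text and a [link](http://example.com).";
        Console.WriteLine(MarkdownText.ToPlainText(input));
    }
}
```

With this approach a link keeps only its visible text, because the URL lives inside the tag that gets removed. If the URLs matter (in an e-mail, say), the anchors would need to be handled before stripping, and a proper HTML parser is probably a safer choice than a regex for anything beyond simple input.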

Upvotes: 2
