Šime Vidas

Reputation: 186033

How to prevent Markdown from wrapping the generated HTML in a <p> element?

Update: The bounty is for a solution using the “marked” library.


This Markdown code:

*foo*

will produce this HTML code:

<p><em>foo</em></p>

Live demo: https://jsbin.com/luganot/edit?js,console

However, I'm already injecting the generated HTML code into an inline context, like so:

<p> text [inject generated HTML here] text </p>

so I don't want the <p> element to wrap around the generated HTML code. I just want the * delimiters to be converted to an <em> element, and so on.

Is there a way to tell the Markdown converter not to produce the <p> wrapper? Currently, I'm doing a .slice(3,-4) on the generated HTML string, which does remove the <p> and </p> tags, but this is obviously not a solution I'd like to keep for the long term.

Upvotes: 21

Views: 4466

Answers (5)

SpaghettiCode4Life

Reputation: 11

The solutions above didn't work for me.

I parsed the content as inline Markdown with parseInline, which removes the <p> wrapper.

From the marked documentation:

const blockHtml = marked.parse('**strong** _em_');
console.log(blockHtml); // '<p><strong>strong</strong> <em>em</em></p>'

const inlineHtml = marked.parseInline('**strong** _em_');
console.log(inlineHtml); // '<strong>strong</strong> <em>em</em>'

Then I use my own HTML element as the parent (Svelte syntax):

<label for={key}>{@html marked.parseInline(value)}</label>

Upvotes: 1

thisizkp

Reputation: 1817

You can skip the block-lexing part and use inlineLexer instead.

html = marked.inlineLexer(markdown, [], options);

//example
marked.inlineLexer('*foo*', []) // will output '<em>foo</em>'
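
As far as I know, inlineLexer is only exposed by older releases of marked, while newer ones offer parseInline (as quoted from the documentation in another answer here). A minimal sketch of a wrapper that picks whichever one is available might look like this. Note that renderInline and the two stub objects are hypothetical names made up for illustration, not part of marked; the stubs only stand in for the real library so the snippet runs on its own.

```javascript
// Hypothetical helper: `markedLib` is whatever `marked` object your build provides.
function renderInline(markedLib, markdown) {
  // Newer releases: inline-only parsing via parseInline.
  if (typeof markedLib.parseInline === 'function') {
    return markedLib.parseInline(markdown);
  }
  // Older releases: inlineLexer(src, links), with an empty list of link definitions.
  if (typeof markedLib.inlineLexer === 'function') {
    return markedLib.inlineLexer(markdown, []);
  }
  throw new Error('This marked build exposes no inline-only entry point');
}

// Stand-in objects (assumptions, not the real library) so the sketch runs without marked.
const legacyStub = { inlineLexer: (src, links) => `legacy:${src}` };
const modernStub = { parseInline: (src) => `modern:${src}` };

console.log(renderInline(legacyStub, '*foo*')); // legacy:*foo*
console.log(renderInline(modernStub, '*foo*')); // modern:*foo*
```

In real code you would pass the actual marked object instead of a stub.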

Upvotes: 8

zzzzBov

Reputation: 179226

If you follow the CommonMark standard, there isn't an official way to remove unwanted elements from the markup that Markdown would otherwise generate. In 2014 I asked about the possibility of an inline mode, but that didn't generate much activity and I never followed up to make it a reality.

With that said, the simplest solution I know of to sanitize markdown is to run it through a whitelist as a post-processing step.

Simply stripping <p> tags probably isn't enough, because it would be relatively easy to accidentally add # characters and end up with stray <h1> to <h6> tags, or to end up with <div> elements, which aren't allowed inside <p> elements.

Whitelisting is pretty straightforward in JS as long as you're in a browser context or using a similar DOM API.

This example takes the output from marked and generates a document fragment. The nodes in the fragment are then filtered based on whether they're phrasing content (which are the only nodes that <p> elements may contain). After filtering, the resultant nodes are then returned so that they may be used in the DOM.

const phrasingContent = [
  '#text', 'a', 'abbr', 'area', 'audio', 'b', 'bdi', 'bdo', 'br', 'button',
  'canvas', 'cite', 'code', 'data', 'datalist', 'del', 'dfn', 'em', 'embed',
  'i', 'iframe', 'img', 'input', 'ins', 'kbd', 'keygen', 'label', 'map', 'mark',
  'math', 'meter', 'noscript', 'object', 'output', 'picture', 'progress', 'q',
  'ruby', 's', 'samp', 'script', 'select', 'small', 'span', 'strong', 'sub',
  'sup', 'svg', 'template', 'textarea', 'time', 'u', 'var', 'video', 'wbr'
]

function sanitize(text) {
  const t = document.createElement('template')
  t.innerHTML = text
  whitelist(t.content, phrasingContent)
  return t.content
}

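function whitelist(parent, names) {
  // Copy the live NodeList first: unwrap() mutates parent.childNodes,
  // and iterating the live list directly can skip siblings.
  for (const node of Array.from(parent.childNodes)) {
    whitelist(node, names)

    if (!names.includes(node.nodeName.toLowerCase())) {
      unwrap(node)
    }
  }
}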

function unwrap(node) {
  const parent = node.parentNode
  while (node.firstChild) {
    parent.insertBefore(node.firstChild, node)
  }
  parent.removeChild(node)
}

function empty(node) {
  while (node.firstChild) {
    node.removeChild(node.firstChild)
  }
}

const form = document.querySelector('form')
const input = document.querySelector('textarea')
const output = document.querySelector('output')

form.addEventListener('submit', e => {
  e.preventDefault()
  
  empty(output)
  
  output.appendChild(sanitize(marked(input.value)))
}, false)

<script src="https://cdnjs.cloudflare.com/ajax/libs/marked/0.3.6/marked.min.js"></script>
<form>
  <p>
    <textarea name="input" cols="30" rows="10">*foo*</textarea>
  </p>
  <button type="submit">Test</button>
</form>

<p> text <output></output> text </p>

Of course, all of this assumes a browser environment, and that whitelisting may be handled after the input is processed through the marked library.

Upvotes: 0

Wim Mostmans

Reputation: 3601

I was searching for a solution to this too when I found this SO thread. I haven't found a good solution here yet, so I wrote my own.

var markdown = new Showdown.converter().makeHtml( '*foo*' );
console.log(markdown.replace(/^<p>|<\/p>$/g, ''));
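
The regex above assumes the output is exactly one paragraph with nothing after the closing tag. A slightly more defensive sketch, assuming you only want to unwrap when the converter produced a single <p> wrapper, could look like this. stripSingleParagraph is a hypothetical helper name, not something Showdown or marked provides.

```javascript
// Hypothetical helper (not part of Showdown or marked).
function stripSingleParagraph(html) {
  // Converters may append a trailing newline, so trim before matching.
  const trimmed = html.trim();
  const match = /^<p>([\s\S]*)<\/p>$/.exec(trimmed);
  // Leave the markup alone unless it is exactly one <p> wrapper:
  // no match means another block element, and a nested '<p>' means
  // several paragraphs were generated.
  if (!match || match[1].includes('<p>')) {
    return trimmed;
  }
  return match[1];
}

console.log(stripSingleParagraph('<p><em>foo</em></p>\n')); // <em>foo</em>
```

Multi-paragraph or heading output is returned unchanged (apart from trimming), so the caller can decide how to handle it.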

Upvotes: 3

moonwave99

Reputation: 22810

Would using jQuery be an option? This would work in that case:

var $text = $(new Showdown.converter().makeHtml( '*foo*' ) );
console.log( $text.html() );

Upvotes: 3
