Sudhir Jonathan

Reputation: 17526

Translate GitHub Flavored Markdown regex from Ruby to Python

I'm trying to get an implementation of GitHub Flavored Markdown working in Python, with no luck. I don't have much in the way of regex skills.

Here's the Ruby code from GitHub:

# in very clear cases, let newlines become <br /> tags
text.gsub!(/(\A|^$\n)(^\w[^\n]*\n)(^\w[^\n]*$)+/m) do |x|
  x.gsub(/^(.+)$/, "\\1  ")
end

And here's what I've come up with so far in Python 2.5:

def newline_callback(matchobj):
    return re.sub(r'^(.+)$','\1 ',matchobj.group(0))     
text = re.sub(r'(\A|^$\n)(^\w[^\n]*\n)(^\w[^\n]*$)+', newline_callback, text)

There just doesn't seem to be any effect at all :-/

If anyone has a fully working implementation of GitHub Flavored Markdown in Python, other than this one (which doesn't seem to work for newlines), I'd love to hear about it. I'm mostly concerned about the newlines.

These are the tests for the regex, from GitHub's Ruby code:

>>> gfm_pre_filter('apple\\npear\\norange\\n\\nruby\\npython\\nerlang')
'apple  \\npear  \\norange\\n\\nruby  \\npython  \\nerlang'
>>> gfm_pre_filter('test \\n\\n\\n something')
'test \\n\\n\\n something'
>>> gfm_pre_filter('# foo\\n# bar')
'# foo\\n# bar'
>>> gfm_pre_filter('* foo\\n* bar')
'* foo\\n* bar'

Upvotes: 0

Views: 455

Answers (2)

Tonttu

Reputation: 1821

In Ruby, ^ and $ always match at line boundaries (the /m modifier there only makes . match newlines), so in Python you need the re.M flag to get the same behavior:

import re

def newline_callback(matchobj):
    # Append two spaces to each non-empty line of the matched block.
    return re.sub(re.compile(r'^(.+)$', re.M), r'\1  ', matchobj.group(0))

text = re.sub(re.compile(r'(\A|^$\n)(^\w[^\n]*\n)(^\w[^\n]*$)+', re.M), newline_callback, text)

Like the Ruby version, that code adds two spaces before a single newline between lines of text, but leaves two consecutive newlines (a paragraph break) alone.
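Putting the pieces together, a minimal self-contained sketch might look like the following. It covers only the newline step, so the gfm_pre_filter here is a stand-in for the full filter named in the question's tests, and the constant names PARAGRAPH_LINES and WHOLE_LINE are made up for illustration.

```python
import re

# re.M makes ^ and $ match at every line boundary, not just at the
# start and end of the whole string.
# PARAGRAPH_LINES and WHOLE_LINE are illustrative names, not from the original code.
PARAGRAPH_LINES = re.compile(r'(\A|^$\n)(^\w[^\n]*\n)(^\w[^\n]*$)+', re.M)
WHOLE_LINE = re.compile(r'^(.+)$', re.M)


def newline_callback(matchobj):
    # Append two spaces to every non-empty line of the matched block.
    return WHOLE_LINE.sub(r'\1  ', matchobj.group(0))


def gfm_pre_filter(text):
    # Stand-in for the full filter: applies only the newline rule.
    return PARAGRAPH_LINES.sub(newline_callback, text)


print(repr(gfm_pre_filter('apple\npear\norange\n\nruby\npython\nerlang')))
# 'apple  \npear  \norange\n\nruby  \npython  \nerlang'
print(repr(gfm_pre_filter('# foo\n# bar')))
# '# foo\n# bar'
```

Compiling the patterns once also sidesteps the fact that re.sub in Python 2.5 takes no flags argument.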

Are the test strings you gave correct? The file you linked has the following, and it works with the fixed code:

"apple\npear\norange\n\nruby\npython\nerlang"
->
"apple  \npear  \norange\n\nruby  \npython  \nerlang"

Upvotes: 1

S.Lott

Reputation: 391952

return re.sub(r'^(.+)$',r'\1 ',matchobj.group(0))
                        ^--- you forgot the r prefix (raw string) here.
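A small sketch of why the prefix matters, assuming a one-line input (the variable names are only for illustration): without r, Python turns '\1' into the control character chr(1) before the re module ever sees it, so no backreference takes place.

```python
import re

line = 'apple'

# Without the r prefix, '\1' is an octal escape for chr(1),
# so the replacement is a control character plus a space.
broken = re.sub(r'^(.+)$', '\1 ', line)

# With the r prefix, the backslash reaches the re module,
# which reads \1 as "the text captured by group 1".
fixed = re.sub(r'^(.+)$', r'\1 ', line)

print(repr(broken))  # '\x01 '
print(repr(fixed))   # 'apple '
```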

Upvotes: 0
