srhoades28

Reputation: 125

RegEx in Python for WikiMarkup

I'm trying to create a regular expression in Python that will match this pattern in order to parse MediaWiki markup:

<ref>*Any_Character_Could_Be_Here</ref>

But I'm totally lost when it comes to regex. Can someone help me, or point me to a tutorial or resource that might be of some help? Thanks!

Upvotes: 1

Views: 72

Answers (2)

zx81

Reputation: 41838

srhoades28, this will match your pattern.

import re

subject = "<ref>*Any_Character_Could_Be_Here</ref>"  # sample input
if re.search(r"<ref>\*[^<]*</ref>", subject):
    print("Successful match")
else:
    print("Match attempt failed")

Note that, going by your post, I have assumed that the * after <ref> always occurs, and that the only variable part is the text that follows it, in your example "Any_Character_Could_Be_Here".

If this is not the case let me know and I will tweak the expression.
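For what it's worth, here is a minimal sketch of how the same pattern could be used to pull out the variable part rather than just test for a match. The capturing group, the helper name extract_starred_refs, and the sample string are assumptions made up for illustration; they are not from the question.

```python
import re

# Same pattern as in the answer, with a capturing group around the variable part.
# [^<]* matches any run of characters other than "<", so it stops at "</ref>".
STARRED_REF = re.compile(r"<ref>\*([^<]*)</ref>")


def extract_starred_refs(text):
    """Return the text after '<ref>*' for each matching ref (hypothetical helper)."""
    return STARRED_REF.findall(text)


sample = "a <ref>*Any_Character_Could_Be_Here</ref> b <ref>no star</ref>"
print(extract_starred_refs(sample))  # ['Any_Character_Could_Be_Here']
```

Because of the [^<] class, a reference whose content contains another tag will not match at all, which may or may not be what you want.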

Upvotes: 1

Justin O Barber

Reputation: 11591

Assuming that svick is correct that MediaWiki markup is not valid XML (or HTML), you could use re in this circumstance (although I will certainly defer to better solutions):

>>> import re
>>> test_string = '''<ref>*Any_Character_Could_Be_Here</ref>
... <ref>other characters could be here</ref>'''
>>> re.findall(r'<ref>.*?</ref>', test_string)
['<ref>*Any_Character_Could_Be_Here</ref>', '<ref>other characters could be here</ref>']  # a list of matching strings

In any case, you will want to familiarize yourself with the re module (whether or not you use a regex to solve this particular problem).
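As a starting point, here is a rough sketch of two re features that tend to matter for this kind of pattern: a capturing group, so that findall returns only the contents, and the re.DOTALL flag, so that . also matches newlines. By default . does not match a newline, so a reference spanning several lines would be missed by the findall call above. The variable names and sample text are invented for the example.

```python
import re

wiki_text = """Intro<ref>first
reference</ref> and more<ref>second</ref>."""

# Without re.DOTALL, "." does not match a newline, so the first ref is skipped.
print(re.findall(r"<ref>(.*?)</ref>", wiki_text))  # ['second']

# With re.DOTALL, "." matches newlines too; ".*?" is still non-greedy,
# so each match stops at the nearest "</ref>".
ref_contents = re.findall(r"<ref>(.*?)</ref>", wiki_text, flags=re.DOTALL)
print(ref_contents)  # ['first\nreference', 'second']
```

This only covers bare <ref> tags as written in the question; anything beyond that would need a different pattern or a proper parser.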

Upvotes: 2
