Reputation: 125
I'm trying to write a regex in Python that will match this pattern, in order to parse MediaWiki markup:
<ref>*Any_Character_Could_Be_Here</ref>
But I'm totally lost when it comes to regex. Can someone help me, or point me to a tutorial or resource that might be of some help? Thanks!
Upvotes: 1
Views: 72
Reputation: 41838
srhoades28, this will match your pattern.
import re

subject = "<ref>*Any_Character_Could_Be_Here</ref>"  # example input
if re.search(r"<ref>\*[^<]*</ref>", subject):
    print("Successful match")
else:
    print("Match attempt failed")
Note that, from your post, I have assumed that the * after <ref> always occurs, and that the only variable part is the text that follows it, in your example "Any_Character_Could_Be_Here". The [^<]* part matches any run of characters other than <, so the match stops at the closing tag.
If this is not the case, let me know and I will tweak the expression.
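If you need the text itself rather than a yes/no answer, one option is to wrap the variable part in a capture group. This is only a minimal sketch built on the pattern above; the names STARRED_REF, extract_starred_refs and sample are made up for illustration, and it still assumes the * is always present and that the content never contains a < character.

```python
import re

# Same pattern as above, with parentheses added around the variable part
# so that findall() returns only the captured text.
STARRED_REF = re.compile(r"<ref>\*([^<]*)</ref>")

def extract_starred_refs(text):
    """Return the text after the '*' for every <ref>*...</ref> found in text."""
    return STARRED_REF.findall(text)

# Hypothetical sample input: the second ref has no '*', so it is skipped.
sample = "a<ref>*Any_Character_Could_Be_Here</ref>b<ref>no star</ref>"
print(extract_starred_refs(sample))  # ['Any_Character_Could_Be_Here']
```

Compiling the pattern once is optional, but it keeps the expression in one place if you end up tweaking it.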
Upvotes: 1
Reputation: 11591
Assuming that svick is correct that MediaWiki markup is not valid XML (or HTML), you could use re in this circumstance (although I will certainly defer to better solutions):
>>> import re
>>> test_string = '''<ref>*Any_Character_Could_Be_Here</ref>
... <ref>other characters could be here</ref>'''
>>> re.findall(r'<ref>.*?</ref>', test_string)  # returns a list of matching strings
['<ref>*Any_Character_Could_Be_Here</ref>', '<ref>other characters could be here</ref>']
In any case, you will want to familiarize yourself with the re module (whether or not you use a regex to solve this particular problem).
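One caveat with the .*? pattern: by default . does not match a newline, so a reference whose content spans several lines would be missed. A rough sketch of one way around that, using the re.DOTALL flag plus a capture group to get only the content; REF, ref_bodies and multi are illustrative names, and it only handles a plain <ref> opening tag as in the question.

```python
import re

# re.DOTALL lets '.' match newlines too; the non-greedy '.*?' stops at the
# first closing tag, so adjacent references are not merged into one match.
REF = re.compile(r"<ref>(.*?)</ref>", re.DOTALL)

def ref_bodies(text):
    """Return the content between each <ref> and its nearest </ref>."""
    return [m.group(1) for m in REF.finditer(text)]

# Hypothetical input with one reference that spans two lines.
multi = "<ref>*first\nline two</ref> text <ref>second</ref>"
print(ref_bodies(multi))  # ['*first\nline two', 'second']
```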
Upvotes: 2