AbyxDev

Reputation: 1555

Regex to match MediaWiki template without certain named parameter

I’ll get to the point: I need a regex that matches any template, out of a list of template names, that does not have a date parameter. Assuming that my list (a singleton for now) is “stub”, the things below that are in bold should be matched:

Additionally, it would be nice if it could also match if the date parameter is blank, but this is not required.

The regex I have so far is:

{{((?:stub|inaccurate)(?!(?:\|.*?\|)*?\|date=.*?(?:\|.*?)*?)(?:\|.*?)*?)}}

However, it matches the fourth and sixth items in the list above.

Note: (?:stub|inaccurate) is just to make sure the template is either a stub or inaccurate template.

Note 2: the regex flavor here is Python 2.7's re module.

Upvotes: 0

Views: 104

Answers (2)

Tgr

Reputation: 28170

Since you are using Python, you have the luxury of an actual parser:

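import mwparserfromhell

wikicode = mwparserfromhell.parse('{{stub|param|date=a|param}}')
for template in wikicode.filter_templates():
    # get() raises ValueError for a missing parameter, so test with has() first
    if template.name.matches('stub') and not template.has('date'):
        print(template)  # a stub template without a date parameter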

That will remain accurate even if the template contains something you would not have expected ({{stub| date=a}}, {{stub|<!--<newline>-->date=a}}, {{stub|foo={{bar}}|date=a}} etc.). The classic answer on the dangers of using regular expressions to parse complex markup applies to wikitext as well.
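To make the failure modes concrete, here is a rough sketch using only the standard re module. The pattern is a simplified look-ahead of the kind discussed on this page, not a recommended solution, and the second sample (a nested template that carries its own date parameter) is a hypothetical variant of the nested example above.

```python
import re

# Assumed, simplified pattern: "stub" with no "|date=" anywhere after it.
NAIVE = re.compile(r'{{(stub(?!.*\|date=).*)}}')

# A space before "date" hides the parameter from the look-ahead,
# so a template that does have a date is reported as lacking one.
has_date_with_space = '{{stub| date=a}}'
false_positive = NAIVE.search(has_date_with_space) is not None

# Hypothetical nested case: only the inner {{bar}} has a date, but the
# look-ahead sees "|date=" and rejects the outer stub template as well.
outer_without_date = '{{stub|foo={{bar|date=b}}}}'
false_negative = NAIVE.search(outer_without_date) is None

print(false_positive, false_negative)
```

Both flags come out true, which is the sort of silent misclassification a parser avoids because it knows which parameter belongs to which template.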

Upvotes: 1

Johannes Riecken

Reputation: 2515

I think a single negative look-ahead that tries to match |date= at any later position should be enough:

{{((?:stub|inaccurate)(?!.*\|date=).*)}}
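A quick way to try this pattern with Python's re module is sketched below. The sample strings are made up for illustration (they are not the list from the question), and each one is assumed to contain a single template on one line.

```python
import re

# The look-ahead pattern above, written as a raw string.
PATTERN = re.compile(r'{{((?:stub|inaccurate)(?!.*\|date=).*)}}')

# Hypothetical sample templates, one per string.
samples = [
    '{{stub}}',
    '{{stub|param}}',
    '{{stub|date=a}}',
    '{{stub|param|date=a|param}}',
    '{{inaccurate|reason=x}}',
]

# Keep only the templates that the pattern accepts (no date parameter).
matched = [s for s in samples if PATTERN.search(s)]
print(matched)
```

Only the three samples without a date parameter end up in matched. Because .* is greedy and the look-ahead scans to the end of the line, several templates on the same line can interfere with each other.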

If empty date parameters have a | following the equals sign, then use

{{((?:stub|inaccurate)(?!.*\|date=[^|}]).*)}}
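The difference made by the extra character class can be checked with a small sketch like the following. The sample strings are again hypothetical, and the variable names are only for illustration.

```python
import re

# Variant that only rejects a date parameter with a non-empty value:
# after "date=" the next character must not be "|" or "}".
PATTERN_BLANK = re.compile(r'{{((?:stub|inaccurate)(?!.*\|date=[^|}]).*)}}')

# Hypothetical samples: blank date at the end, blank date in the middle,
# a filled-in date, and no date at all.
samples = [
    '{{stub|date=}}',
    '{{stub|date=|param}}',
    '{{stub|date=2020}}',
    '{{stub|param}}',
]

# Map each sample to whether the pattern accepts it.
results = {s: PATTERN_BLANK.search(s) is not None for s in samples}
for sample, ok in results.items():
    print(sample, ok)
```

Here only the template with a filled-in date is rejected; the two blank-date templates are accepted along with the one that has no date at all.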

Upvotes: 0
