Demosthene

Reputation: 359

Ignore comments with whitespace in regex

How can I modify the following line of code (which reads the names of parameters in config_file):

re.findall('Parameter.*', config_file)

so that it ignores lines where a comment symbol (%) appears to the left of Parameter? For instance, in the following example,

Parameter: A
%Parameter: B
  %  Parameter: C
 Parameter: D %the best parameter

only A and D should match.

Upvotes: 4

Views: 420

Answers (2)

Gurmanjot Singh

Reputation: 10360

Try this Regex:

(?:(?<=^)|(?<=\n))\s*Parameter.*

Explanation:

  • (?:(?<=^)|(?<=\n)) - matches a position at the start of the string or immediately after a \n
  • \s* - matches 0+ whitespace characters
  • Parameter.* - matches Parameter followed by 0+ occurrences of any character (except newline characters)
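As a rough sketch of how this pattern behaves in Python on the sample from the question: because \s* is part of the match, the indented line comes back with its leading space, so the example strips it afterwards. The second pattern (re.MULTILINE plus a capturing group) is an assumed alternative, not part of the answer above.

```python
import re

# Sample input from the question.
config_file = (
    "Parameter: A\n"
    "%Parameter: B\n"
    "  %  Parameter: C\n"
    " Parameter: D %the best parameter"
)

# Pattern from this answer: anchor at the start of the string or right after
# a newline, allow leading whitespace, then match "Parameter" to end of line.
pattern = r'(?:(?<=^)|(?<=\n))\s*Parameter.*'

# The leading whitespace is included in each match, so strip it off.
matches = [m.lstrip() for m in re.findall(pattern, config_file)]
print(matches)

# Assumed alternative: with re.MULTILINE, ^ matches at every line start, and
# the capturing group keeps the indentation out of the result.
alt_matches = re.findall(r'^[ \t]*(Parameter.*)', config_file, flags=re.MULTILINE)
print(alt_matches)
```

Both variants keep the trailing comment on the D line, since the question only asks to skip lines where % precedes Parameter.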

Upvotes: 3

anubhava

Reputation: 785108

You can make use of regex alternation and capturing groups in findall:

>>> import re
>>> test_str = ("Parameter: A\n"
...     "%Parameter: B\n"
...     "  %  Parameter: C\n"
...     " Parameter: D %the best parameter")
>>>
>>> # Python 3: filter() returns an iterator, so wrap it in list()
>>> print(list(filter(None, re.findall(r'%\s*Parameter|(Parameter.*)', test_str))))
['Parameter: A', 'Parameter: D %the best parameter']

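The pattern you want to discard goes first in the alternation, outside any capturing group; the pattern you want to keep goes last, inside the capturing group. For a discarded match, findall returns an empty string, which filter(None, ...) removes.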
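To make the discard step visible, here is a small sketch that prints what findall returns before filtering. The variable names raw and kept are illustrative only; the list comprehension is simply an equivalent of filter(None, ...).

```python
import re

test_str = (
    "Parameter: A\n"
    "%Parameter: B\n"
    "  %  Parameter: C\n"
    " Parameter: D %the best parameter"
)

# With one capturing group, findall returns that group's text for every match.
# Matches of the first alternative (the commented-out ones) leave the group
# empty, so they show up as empty strings.
raw = re.findall(r'%\s*Parameter|(Parameter.*)', test_str)
print(raw)

# Drop the empty strings to keep only the uncommented parameters.
kept = [m for m in raw if m]
print(kept)
```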

Upvotes: 1
