MBasith

Reputation: 1499

Python Regex Sub with Multiple Patterns

I'm trying to match multiple patterns with a single regex sub call, using groups, and replace the value in each match with an asterisk. The data file has a format similar to the string below. However, I get the desired result only for the first match; the subsequent matches consume parts of the string that I expected to be kept. Is there a better approach to getting the desired output below?

    import re
    myString = '-fruit apple -number    123 -animal  cat  -name     bob'

    match = re.compile('(-fruit\s+)(\w+)|'
                       '(-animal\s+)(cat)|'
                       '(-name\s+)(bob)')
    print(match.sub('\g<1>*', myString))

Current Output:

    -fruit * -number    123 *  *

Desired Output:

    -fruit * -number    123 -animal  *  -name     *

Upvotes: 2

Views: 995

Answers (1)

Sebastian Proske

Reputation: 8413

Alternation does not reset the group numbers, so your groups are numbered (1)(2)|(3)(4)|(5)(6). Your replacement reinserts only group 1, but the prefix you want to keep is in group 3 for the second alternative and in group 5 for the third. Since groups that did not take part in the match are treated as an empty string when replacing (Python 3.5 and later), you can simply add them to your replacement string: \g<1>\g<3>\g<5>*.
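As a minimal sketch of that fix, here is the code from the question with groups 1, 3, and 5 all reinserted. The names `my_string`, `pattern`, and `result` are just illustrative, and it assumes Python 3.5 or later, where unmatched groups are replaced with an empty string:

```python
import re

my_string = '-fruit apple -number    123 -animal  cat  -name     bob'

# Groups are numbered left to right across the whole pattern:
# (1)(2) | (3)(4) | (5)(6)
pattern = re.compile(r'(-fruit\s+)(\w+)|'
                     r'(-animal\s+)(cat)|'
                     r'(-name\s+)(bob)')

# Only one of groups 1, 3, 5 takes part in any given match;
# the other two expand to an empty string.
result = pattern.sub(r'\g<1>\g<3>\g<5>*', my_string)
print(result)
# -fruit * -number    123 -animal  *  -name     *
```

Because only one alternative can match at a time, concatenating the three prefix groups always yields exactly the prefix that was matched.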

As a side note, I would recommend using raw strings when working with regex patterns (r'pattern'), so you do not have to wonder where to double a backslash (e.g. \\b).
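To illustrate why raw strings matter, here is a small sketch using a made-up sample string (`text` and the variable names are assumptions for the example). In a normal string literal, `\b` is a backspace character, so the regex engine never sees a word boundary:

```python
import re

text = 'cat concat'

# Normal string: '\b' is a backspace character, not a word boundary,
# so this pattern looks for backspace + 'cat' + backspace.
plain = re.findall('\bcat\b', text)

# Raw string: the backslash reaches the regex engine untouched.
raw = re.findall(r'\bcat\b', text)

# Normal string with doubled backslashes is equivalent to the raw form.
escaped = re.findall('\\bcat\\b', text)

print(plain)    # []
print(raw)      # ['cat']
print(escaped)  # ['cat']
```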

Upvotes: 3
