Crowman

Reputation: 25908

Suppressing matching groups in Python regex

I have the following code to match an escaped string:

import re

match_str = r'''(["/']).*?(?<!\\)(\\\\)*\1'''
test_str = r'''"This is an \"escaped\" string" and this isn't.'''

mo = re.match(match_str, test_str)

if mo:
    print(mo.group())

which works fine.

However, while I understand that I need the groups in there to handle the repetition and the backreference, I'm not interested in using the groups after the match. I know I can just call mo.group(0) and get the whole thing, but for what I am doing it would be helpful if the match behaved as if the pattern had no groups in this type of case, i.e. if mo.groups() returned an empty tuple.

Is there any way to do this?

EDIT: If it helps, I'm trying to do something like this:

ma = [myclass("regex1nogroups", [func1]),
      myclass("regex2twogroups", [func2, func3]),
      myclass("regex3fourgroups", [func4, func5, func6, func7]),
      myclass("regex4nogroups", [func8])]

for mc in ma:
    mo = re.match(mc.pattern, str_to_match)
    if mo:
        for n in range(len(mc.funclist)):
            result = mo.group(n+1 if mo.groups() else 0)
            mc.funclist[n](result)

using the length of the list of functions to determine how many groups the regex should produce. I could add an extra flag member to myclass to be true if I want to just assume there are no groups, but it would be nice to avoid this.

Upvotes: 1

Views: 461

Answers (3)

Nicoolasens

Reputation: 3558

If you want to strip a pattern from a column of strings in pandas, use the .str accessor together with replace.

On a column such as df.col_str, the .str accessor lets you apply string methods to every element.

For example, given this column:

2             blabla...Length=45
3             bloblo...Length=44
4          VANILLE ...Length=448
5             fooo 1...Length=44
6             Colori...Length=70

Suppose you want to remove the trailing ...Length=99, where 99 stands for any number (45, 448, and so on). Put the constant part followed by [0-9]+ inside a non-capturing group, (?:...), and escape the dots so that they match literal periods rather than any character.

Use str.replace:

df.col_str.str.replace(pat=r"(?:\.\.\.Length=[0-9]+)", repl="", regex=True)

Output:

    2             blabla
    3             bloblo
    4            VANILLE
    5             fooo 1
    6             Colori

or

df["col_str"] = df["col_str"].replace(to_replace=r"(?:\.\.\.Length=[0-9]+)", value="", regex=True)
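
To check the pattern without pandas, here is a minimal sketch using only the standard re module. The sample rows are copied from the column shown above; the names rows and cleaned are just for illustration. The dots are escaped so they match literal periods.

```python
import re

# Two sample rows taken from the column shown above.
rows = ["blabla...Length=45", "VANILLE ...Length=448"]

# Non-capturing group around the constant suffix plus one or more digits.
pattern = re.compile(r"(?:\.\.\.Length=[0-9]+)")

# Remove the suffix from every row.
cleaned = [pattern.sub("", s) for s in rows]
print(cleaned)
```

Note that any whitespace in front of the suffix is left in place, since the pattern does not cover it.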

Upvotes: 0

Crowman

Reputation: 25908

I ended up approaching the problem in a different way, taking the obvious step of looking at the length of the function list rather than at mo.groups():

ma = [myclass("regex1nogroups", [func1]),
      myclass("regex2twogroups", [func2, func3]),
      myclass("regex3fourgroups", [func4, func5, func6, func7]),
      myclass("regex4nogroups", [func8])]

for mc in ma:
    mo = re.match(mc.pattern, str_to_match)
    if mo:
        for n,f in enumerate(mc.funclist):
            result = mo.group(n+1 if len(mc.funclist) > 1 else 0)
            f(result)
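
For anyone who wants to run this, here is a minimal self-contained sketch of the same loop. Rule is a hypothetical stand-in for myclass, the second pattern and the test string are made up for the demonstration, and seen.append plays the role of the handler functions.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for myclass: a pattern plus its handler functions.
Rule = namedtuple("Rule", ["pattern", "funclist"])

seen = []  # records what each handler receives, for demonstration only

rules = [
    # One handler: it receives the whole match, although the pattern has groups.
    Rule(r'''(["/']).*?(?<!\\)(\\\\)*\1''', [seen.append]),
    # Two handlers: each receives its own numbered group.
    Rule(r'"(\w+) (\w+)', [seen.append, seen.append]),
]

str_to_match = '"This is a string" and more'

for rule in rules:
    mo = re.match(rule.pattern, str_to_match)
    if mo:
        for n, f in enumerate(rule.funclist):
            # A single handler gets group 0; several handlers get groups 1..N.
            f(mo.group(n + 1 if len(rule.funclist) > 1 else 0))

print(seen)
```

One limitation of this rule: a pattern with exactly one handler always receives the whole match, so a one-group pattern whose handler wants only group 1 cannot be expressed this way.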

Upvotes: 0

Blender

Reputation: 298176

Just add ?: right after the opening parenthesis and the group becomes non-capturing:

(?:\\\\)
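
As a quick check, here is a small sketch that applies this to the pattern from the question and compares what groups() returns. The variable names are just for illustration.

```python
import re

test_str = r'''"This is an \"escaped\" string" and this isn't.'''

# Pattern from the question: two capturing groups.
capturing = re.compile(r'''(["/']).*?(?<!\\)(\\\\)*\1''')
# Same pattern with the backslash-pair group made non-capturing.
non_capturing = re.compile(r'''(["/']).*?(?<!\\)(?:\\\\)*\1''')

mo_cap = capturing.match(test_str)
mo_non = non_capturing.match(test_str)

print(mo_cap.group(0) == mo_non.group(0))  # the overall match is unchanged
print(mo_cap.groups())  # two entries, one per capturing group
print(mo_non.groups())  # one entry: the opening quote
```

The first group has to stay capturing because the backreference \1 refers to it, so with this pattern mo.groups() shrinks to one entry but does not become empty.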

Upvotes: 3
