Regexp works on regex101.com but not in Python

Question

I am trying to write a function that takes an array of folder names and a number (the season whose folder the function should return) and checks whether there is a folder with the right season number. I don't only own English TV shows, so my folders are named "Staffel" (German for "Season") for German shows and "Season" for English ones.

In this example the directory contains the different folders in d. I'm looking for (Season|Staffel) 2, and the function should return Season 02 because it occurs before Staffel 2 in the array. In Python, however, none of the folders match, as the output below the code shows.

def findFolderbyNumber(path, number):
    d = getFolders(path)
    d = ['Staffel 1','Staffel 20','Season 02', 'Staffel 2', 'Season 3']
    number = 2
    for obj in d:
        pattern = '(.*)(Staffel|Season)((\s?)*)((0?)*)('+str(number)+')(\D)(.*)'
        m = re.match(pattern, obj)
        print(obj, end='	Match = ')
        print(m)
        if(m):
            return obj
    return 0


Staffel 1   Match = None
Staffel 20  Match = None
Season 02   Match = None
Staffel 2   Match = None
Season 3    Match = None

Wiktor Stribiżew · Accepted Answer

You need to replace the last \D with (?!\d).

In your test on regex101.com you used a multiline string as input, so a line break after the 2 could satisfy \D. In the code you test individual strings in which nothing at all follows the 2. \D is a consuming pattern: there must be a non-digit character at that position. (?!\d) is a negative lookahead, a non-consuming pattern that only requires that the next character, if there is one, is not a digit.
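As a rough illustration of the difference, the two variants can be compared on the folder names from the question. This is only a sketch: the pattern is a simplified form of the one in the question, and the multiline string is an assumption about what the regex101.com test input looked like.

```python
import re

folders = ['Staffel 1', 'Staffel 20', 'Season 02', 'Staffel 2', 'Season 3']
number = 2

# Simplified prefix of the pattern from the question (raw strings throughout).
prefix = r'.*(?:Staffel|Season)\s*0*' + str(number)

consuming = re.compile(prefix + r'\D')      # needs a real non-digit character after the number
lookahead = re.compile(prefix + r'(?!\d)')  # only forbids a digit after the number

consuming_hits = [f for f in folders if consuming.match(f)]
lookahead_hits = [f for f in folders if lookahead.match(f)]

print(consuming_hits)  # [] because nothing follows the 2 in 'Season 02' or 'Staffel 2'
print(lookahead_hits)  # ['Season 02', 'Staffel 2']

# Assumed shape of the online test input: all names in one multiline string.
# Here the newline after 'Season 02' is a non-digit, so \D succeeds.
multiline_hit = consuming.match('Season 02\nStaffel 2')
print(multiline_hit is not None)  # True
```

In both variants Staffel 20 is rejected, because the 2 is followed by a digit.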

Another solution is to replace the last \D with a word boundary, \b. In that case you have to use a raw string literal (i.e. r'pattern'), because in a regular Python string literal '\b' is the backspace character rather than a regex word boundary.
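A minimal sketch of the word-boundary variant follows. The function name find_folder_by_number, its signature (it takes the list directly instead of calling getFolders) and the None return value are choices made for this illustration, not part of the original code.

```python
import re

def find_folder_by_number(folders, number):
    """Return the first folder name matching Season/Staffel <number>, else None."""
    # Raw string literals keep \s and \b as regex escapes.
    pattern = re.compile(r'.*(?:Staffel|Season)\s*0*' + str(number) + r'\b')
    for name in folders:
        if pattern.match(name):
            return name
    return None

folders = ['Staffel 1', 'Staffel 20', 'Season 02', 'Staffel 2', 'Season 3']
result = find_folder_by_number(folders, 2)
print(result)  # Season 02

# Why the raw string matters: without the r prefix, '\b' is a single
# backspace character, while r'\b' is the two characters backslash and b.
backspace_is_single_char = ('\b' == '\x08')
print(backspace_is_single_char, len(r'\b'))  # True 2
```

Note that \b and (?!\d) are not fully interchangeable: a name such as Season 2a is rejected by \b, since 2 and a are both word characters, but it is accepted by (?!\d).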
