Barney Stinson
Barney Stinson

Reputation: 1002

Regexp is working regex101.com but not in python

I am trying to make a function which gets an array of folder names and a number (which season-folder the function should return) and I want to check if the theres a folder with the right season number [Staffel = Season in German] but I dont just own plain English TV-Shows so my folders are Named Staffel == German TV Show, and Season if its Eng.

In this Example the Folder would contain diffrent folders (d) Im Looking for (Season|Staffel) 2 it should return Season 02 because it occurs before Staffel 2 in the array

def findFolderbyNumber(path, number):
    d = getFolders(path)
    d = ['Staffel 1','Staffel 20','Season 02', 'Staffel 2', 'Season 3']
    number = 2
    for obj in d:
        pattern = '(.*)(Staffel|Season)((\s?)*)((0?)*)('+str(number)+')(\D)(.*)'
        m = re.match(pattern, obj)
        print(obj, end='\tMatch = ')
        print(m)
        if(m):
            return obj
    return 0


Staffel 1   Match = None
Staffel 20  Match = None
Season 02   Match = None
Staffel 2   Match = None
Season 3    Match = None

Upvotes: 2

Views: 1066

Answers (2)

NublicPablo
NublicPablo

Reputation: 1039

Google brought me here, but for me the problem was that I was compiling the expression using compiled_regex = re.compile(myregex, re.IGNORECASE) and then trying to search with re.IGNORECASE flag again (compiled_regex.search(test_str, re.IGNORECASE).

Removing the extra re.IGNORECASE from the search parameters made it work.

compiled_regex = re.compile(myregex, re.IGNORECASE)
compiled_regex.search(test_str)

In case somebody in my situation lands here from Google again.

Upvotes: 0

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 626926

You need to replace the last \D with (?!\d).

In your testing, you used a multiline string input and in the code, you test individual strings that have no digit at the end after 2. \D is a consuming pattern, there must be a non-digit char, and the (?!\d) is a negative lookahead, a non-consuming pattern that just requires that the next char cannot be a digit.

Another solution is to replace the last \D with a word boundary \b, but you have to use a raw string literal to avoid issues with escaping (i.e. use r'pattern').

Upvotes: 2

Related Questions