Reputation: 1495
I have a list of files, and I want to keep only the ones which start with 'test_' and end with '.py'. I want the regex to return only the text between 'test_' and '.py'. I do not want .pyc files included.
I have tried:
>>> import re
>>> filename = 'test_foo.py'
>>> re.search(r'(?<=test_).+(?=\.py)', filename).group()
foo.py
but it still returns the extension, and will allow '.pyc' extensions (which I do not want). I'm pretty sure it's the '+' which is consuming the whole string.
This works as a fallback, but I would prefer a regex solution:
>>> filename = 'test_foo.py'
>>> if filename.startswith('test_') and filename.endswith('.py'):
...     result = filename.replace('test_', '').replace('.py', '')
...     print(result)
...
foo
Upvotes: 8
Views: 27556
Reputation:
Look at this:
import re
files = [
"test_1.py",
"Test.py",
"test.pyc",
"test.py",
"script.py"]
print([x for x in files if re.search(r"^test_.*\.py$", x)])
output:
['test_1.py']
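The filter above keeps the whole filename. If only the part between 'test_' and '.py' is wanted, as the question asks, a capturing group is one option. This is a minimal sketch, assuming a file list similar to the one above; the names pattern and inner_names are illustrative only:

```python
import re

files = ["test_1.py", "Test.py", "test.pyc", "test_a.pyc",
         "test.py", "script.py", "test_foo.py"]

# ^test_ anchors the prefix, \.py$ anchors the extension,
# and the group captures whatever sits in between.
pattern = re.compile(r"^test_(.+)\.py$")

matches = [pattern.search(name) for name in files]
inner_names = [m.group(1) for m in matches if m]
print(inner_names)
```

With this list, only "test_1.py" and "test_foo.py" match, so the result holds the two inner names. The '.pyc' entries are rejected because '$' requires the string to end right after '.py'.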
Upvotes: 2
Reputation: 149000
The problem is that your pattern matches any string that comes after 'test_' and before '.py', but that doesn't restrict it from having other characters before the 'test_' or after the '.py'.
You need to use start ('^') and end ('$') anchors. Also, don't forget to escape the '.' character. Try this pattern:
(?<=^test_).+(?=\.py$)
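To see how the anchored pattern behaves, here is a small sketch that runs it against a few filenames. The helper inner_name and the sample filenames are made up for illustration; only the pattern itself comes from this answer:

```python
import re

# Lookbehind/lookahead version: only the middle part is in the match itself.
pattern = re.compile(r"(?<=^test_).+(?=\.py$)")


def inner_name(filename):
    """Return the text between 'test_' and '.py', or None if there is no match."""
    m = pattern.search(filename)
    return m.group() if m else None


for name in ["test_foo.py", "test_foo.pyc", "my_test_foo.py", "test_.py"]:
    print("%s -> %s" % (name, inner_name(name)))
```

Only 'test_foo.py' should produce a match here: the '.pyc' name fails the '$' anchor, 'my_test_foo.py' fails the '^' inside the lookbehind, and 'test_.py' has nothing for '.+' to consume.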
Upvotes: 9