Reputation: 101
I have the following regex:
(\b)(con)
This matches:
.con
con
But I only want to match the second line, 'con', not '.con'.
I then need to expand this to match alternative words such as (CON|COM1|LPT1). In those cases I also need to match the dot afterwards and possibly file extensions too. I already have regexes for those; I am trying to understand one part of the expression at a time.
How can I tighten what I've got to give me the specific match I require?
Upvotes: 0
Views: 1401
Edit:
You can use non-capturing groups and re.match
(which only matches at the start of the string):
>>> from re import match
>>> strs = ["CON.txt", "LPT1.png", "COM1.html", "CON.jpg"]
>>> # This can be customized to what you want
>>> # Right now, it is matching .jpg and .png files with the proper beginning
>>> [x for x in strs if match(r"(?:CON|COM1|LPT1)\.(?:jpg|png)$", x)]
['LPT1.png', 'CON.jpg']
>>>
Below is a breakdown of the Regex pattern:
(?:CON|COM1|LPT1) # CON, COM1, or LPT1
\. # A period
(?:jpg|png) # jpg or png
$ # The end of the string
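The same breakdown can also be written straight into the pattern with the re.VERBOSE flag, which ignores whitespace in the regex and allows # comments. This is only a sketch: the name RESERVED is made up for illustration, and the list is the same sample data as above.

```python
import re

# Same pattern as above; re.VERBOSE lets the explanation live inside the regex.
RESERVED = re.compile(r"""
    (?:CON|COM1|LPT1)   # CON, COM1, or LPT1
    \.                  # a period
    (?:jpg|png)         # jpg or png
    $                   # the end of the string
""", re.VERBOSE)

strs = ["CON.txt", "LPT1.png", "COM1.html", "CON.jpg"]
matched = [x for x in strs if RESERVED.match(x)]
print(matched)
```

Because whitespace is ignored in this mode, a literal space in the pattern would have to be escaped or put in a character class.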
You may also want to add (?i) to the start of the pattern in order to have case-insensitive matching.
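For the original case in the question (match 'con' but not '.con'), the start anchoring of re.match is what does the work: with a leading dot the pattern already fails at the first character. Below is a rough sketch that assumes the extension is optional and that the whole string should be checked. The helper name is_reserved and the (?:\.\w+)? extension part are assumptions for illustration, not something taken from the question.

```python
import re

# (?i) makes the match case-insensitive.
# (?:\.\w+)? allows an optional ".ext" after the word (assumed shape of an extension).
PATTERN = re.compile(r"(?i)(?:CON|COM1|LPT1)(?:\.\w+)?$")

def is_reserved(name):
    """Return True if name is one of the listed words, optionally followed by an extension."""
    # match() only tries position 0, so ".con" cannot match.
    return PATTERN.match(name) is not None

for name in ["con", ".con", "CON.txt", "console"]:
    print(name, is_reserved(name))
```

Here 'con' and 'CON.txt' are accepted, while '.con' is rejected by the start anchoring and 'console' is rejected by the $ at the end.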
Upvotes: 8