Ross Spencer

Reputation: 101

Python regex, matching absolute beginning of string, nothing else before

I have the following regex:

(\b)(con)    

This matches:

.con
con

But I only want to match the second line, 'con', and not '.con'.

This then needs expanding to enable me to match alternative words (CON|COM1|LPT1) etc. And in those scenarios, I need to match the dot afterwards and potentially file extensions too. I have regex for these. I am attempting to understand one part of the expression at a time.

How can I tighten what I've got to give me the specific match I require?

Upvotes: 0

Views: 1401

Answers (2)

user2555451

Reputation:

Edit:

You can use non-capturing groups and re.match (which only matches at the start of the string):

>>> from re import match
>>> strs = ["CON.txt", "LPT1.png", "COM1.html", "CON.jpg"]
>>> # This can be customized to what you want
>>> # Right now, it is matching .jpg and .png files with the proper beginning
>>> [x for x in strs if match(r"(?:CON|COM1|LPT1)\.(?:jpg|png)$", x)]
['LPT1.png', 'CON.jpg']
>>>

Below is a breakdown of the Regex pattern:

(?:CON|COM1|LPT1)  # CON, COM1, or LPT1
\.                 # A period
(?:jpg|png)        # jpg or png
$                  # The end of the string

You may also want to add (?i) to the start of the pattern in order to have case-insensitive matching.
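
As a rough sketch of that case-insensitive variant, the snippet below passes re.IGNORECASE (equivalent to a leading (?i)) to the same pattern. The helper name is_reserved_image and the sample filenames are made up for illustration; they are not from the question.

```python
import re

# Same pattern as above, compiled once with case-insensitive matching.
RESERVED = re.compile(r"(?:CON|COM1|LPT1)\.(?:jpg|png)$", re.IGNORECASE)

def is_reserved_image(name):
    """Hypothetical helper: True if name is a reserved device name plus .jpg/.png."""
    # Pattern.match only succeeds at the start of the string,
    # so a leading "." or any other prefix makes it fail.
    return RESERVED.match(name) is not None

names = ["con.jpg", "Lpt1.PNG", ".con.jpg", "CON.txt", "xCON.png"]
matches = [n for n in names if is_reserved_image(n)]
print(matches)
```

Because match is anchored at the start, '.con.jpg' and 'xCON.png' are rejected without needing an explicit ^ in the pattern.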

Upvotes: 8

utdemir

Reputation: 27216

^ matches the start of the string, so:

^con

would work.
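
One caveat that may be worth illustrating: with re.search, ^ only means the absolute start of the string while re.MULTILINE is off, whereas \A means it regardless of flags. A minimal sketch, assuming the two lines from the question arrive as one string (the variable names are illustrative):

```python
import re

# The two sample lines from the question, joined into one string.
text = ".con\ncon"

# Without flags, ^ matches only at the very start, so ".con..." fails.
no_flag = re.search(r"^con", text)

# With re.MULTILINE, ^ also matches right after each newline.
multi = re.search(r"^con", text, re.MULTILINE)

# \A is the absolute start of the string, even under re.MULTILINE.
absolute = re.search(r"\Acon", text, re.MULTILINE)

# A string that really does begin with "con" matches \A.
plain = re.search(r"\Acon", "con.txt")

print(no_flag, multi.start(), absolute, plain.group())
```

If the input can contain newlines and only the true start of the string should count, \A (or re.match) is the safer choice.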

Upvotes: 3
