Anudocs

Reputation: 686

Matching word without dot with Python regex

I have a line in a file that can take one of the two formats below:

/begin MEASUREMENT XXX.YYYY "Status ASC" 

and

/begin MEASUREMENT XXXX "Status ASC" 

I want to write an expression that does not match the first format but does match the second, and that gives me XXXX from the second format.

I tried the expression below but couldn't get the desired result:

/begin\s+MEASUREMENT (\w+)

What changes can I make in my regex?

Upvotes: 1

Views: 173

Answers (2)

Wiktor Stribiżew

Reputation: 626861

You may require a whitespace or the end of the string after \w+:

/begin\s+MEASUREMENT (\w+)(?!\S)
/begin\s+MEASUREMENT (\w+)(?=\s|$)
/begin\s+MEASUREMENT (\w+)(?:\s|$)

The (?!\S) is a negative lookahead that fails the match if the next char is a non-whitespace char. It is equal in meaning to (?=\s|$), a positive lookahead that requires a whitespace or end of string immediately to the right of the current location. (?:\s|$) is a consuming variation of the latter (i.e. the whitespace, if matched, will land in the whole match), but since you are capturing the word before it, that should not be a problem.
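To check the three variants against both line formats from the question, a minimal Python sketch could look like this. The helper name extract_name and the dictionary labels are made up for illustration and are not part of the question or of the re module:

```python
import re

# The two line formats from the question.
lines = [
    '/begin MEASUREMENT XXX.YYYY "Status ASC"',
    '/begin MEASUREMENT XXXX "Status ASC"',
]

# The three variants from this answer, keyed by illustrative labels.
patterns = {
    "negative lookahead": r'/begin\s+MEASUREMENT (\w+)(?!\S)',
    "positive lookahead": r'/begin\s+MEASUREMENT (\w+)(?=\s|$)',
    "consuming group": r'/begin\s+MEASUREMENT (\w+)(?:\s|$)',
}


def extract_name(pattern, line):
    """Return the captured word, or None when the line does not match."""
    match = re.search(pattern, line)
    return match.group(1) if match else None


# For each variant, collect the capture for the first and second format.
results = {
    label: [extract_name(pattern, line) for line in lines]
    for label, pattern in patterns.items()
}

for label, captured in results.items():
    print(label, captured)

# Only the consuming variant pulls the trailing space into the whole match.
whole_lookahead = re.search(patterns["positive lookahead"], lines[1]).group(0)
whole_consuming = re.search(patterns["consuming group"], lines[1]).group(0)
```

All three variants reject the XXX.YYYY line and capture XXXX from the other one. The only difference is in group(0), where the consuming variant also contains the space that follows the word.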

Upvotes: 2

The fourth bird

Reputation: 163362

You could make use of a word boundary \b and a negative lookahead (?!\.) to assert that what is directly to the right is not a dot:

/begin\s+MEASUREMENT (\w+)\b(?!\.)
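As a rough sketch of how this pattern behaves in Python, the snippet below runs it on the two formats from the question. The third line, with a hyphen in the name, is a hypothetical input that does not appear in the question and is included only to show where this pattern differs from a whitespace-based check such as (?!\S):

```python
import re

# Word boundary plus "not followed by a dot".
pattern = re.compile(r'/begin\s+MEASUREMENT (\w+)\b(?!\.)')

lines = [
    '/begin MEASUREMENT XXX.YYYY "Status ASC"',  # first format from the question
    '/begin MEASUREMENT XXXX "Status ASC"',      # second format from the question
    '/begin MEASUREMENT XXXX-YY "Status ASC"',   # hypothetical, not in the question
]

captured = []
for line in lines:
    match = pattern.search(line)
    captured.append(match.group(1) if match else None)

print(captured)

# For comparison: the whitespace-based lookahead rejects the hyphenated name.
strict = re.search(r'/begin\s+MEASUREMENT (\w+)(?!\S)', lines[2])
```

The \b keeps \w+ from backtracking to a shorter word such as XX, and the lookahead then rules out the dot only. Any other non-word character after the name, like the hyphen above, is still accepted, so which pattern fits depends on what the names in the file can contain.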

Upvotes: 1
