Reputation: 1854
This regex (regular expression) finds all words or groups of words that begin with a capital letter. But it should exclude a capitalized word that comes right after a dot followed by a space, i.e. in ". Hello you" it should exclude Hello, because a dot and a space precede it.
The goal is to replace every match in a text with an href link, while excluding ". Any word beginning with a capital letter". It looks like this:
// EXCLUDE: (. Hello) a dot and a space precede the capitalized word
const regex = /\b((?!\.[\s]+)(?:[A-Z][\p{L}0-9-_]+)(?:\s+[A-Z][\p{L}0-9-_]+)*)\b/ug;
const subst = '<a href="#">$1</a>';
I thought that (?!\.[\s]+) should do the trick, but it doesn't.
Here is a test on regex101: https://regex101.com/r/nwyL8I/3
Thank you.
Upvotes: 0
Views: 62
Reputation: 169
The current regular expression seems to match words or groups of words that start with a capital letter and excludes words that are preceded by a dot and a space. However, the exclusion of words after a dot followed by a space might not be working as expected.
One issue with the current regular expression is that its exclusion part only checks for whitespace after the dot. You can modify the exclusion part to check instead whether the character after the dot is a capital letter:
const regex = /\b((?!\.[A-Z][\p{L}0-9-_])(?:[A-Z][\p{L}0-9-_]+)(?:\s+[A-Z][\p{L}0-9-_]+)*)\b/ug;
This modification should ensure that words that are preceded by a dot and a capital letter will be excluded from the matching process.
Upvotes: 0
Reputation: 44053
The correct way to express a negative lookbehind assertion for your situation would be (?<!\.\s+) and not (?!\.\s+), which is a negative lookahead assertion. So I would use:
((?<!\.\s+)\b(?:[A-Z][\p{L}0-9-_]+)(?:\s+[A-Z][\p{L}0-9-_]+)*)\b
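To see the difference in practice, here is a minimal sketch that runs the question's lookahead pattern and the lookbehind pattern above on the same input. The sample sentence and the helper name linkCapitalized are made up for illustration, and the sketch assumes a JavaScript engine that supports variable-length lookbehind (ES2018 or later).

```javascript
// Pattern from the question: the negative lookahead looks at what FOLLOWS
// the current position, which is always a capital letter, so it never rejects.
const lookahead = /\b((?!\.[\s]+)(?:[A-Z][\p{L}0-9-_]+)(?:\s+[A-Z][\p{L}0-9-_]+)*)\b/gu;

// Pattern from this answer: the negative lookbehind looks at what PRECEDES
// the current position and rejects a match that starts right after ". ".
const lookbehind = /((?<!\.\s+)\b(?:[A-Z][\p{L}0-9-_]+)(?:\s+[A-Z][\p{L}0-9-_]+)*)\b/gu;

const subst = '<a href="#">$1</a>';

// Hypothetical helper: wrap every match of `regex` in a link.
function linkCapitalized(text, regex) {
  return text.replace(regex, subst);
}

// Made-up sample: "Hello" follows a dot and a space, so it should stay plain.
const sample = 'I met John Smith. Hello you, said Mary.';

console.log(linkCapitalized(sample, lookahead));
// I met <a href="#">John Smith</a>. <a href="#">Hello</a> you, said <a href="#">Mary</a>.

console.log(linkCapitalized(sample, lookbehind));
// I met <a href="#">John Smith</a>. Hello you, said <a href="#">Mary</a>.
```

Note that the single-letter word I is not linked by either pattern, because both require at least two characters per word.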
But (?:[A-Z][\p{L}0-9-_]+) will not match words with a single letter, such as A. Is that what you really want?
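If single-letter words should be matched too, one possible variant (an assumption about what is wanted, not something stated in the question) is to change the + quantifier after each capital letter to *. The sketch below compares the two; the sample sentence and the helper name findCapitalized are made up for illustration.

```javascript
// Lookbehind pattern with "+": each word needs at least two characters.
const twoOrMore = /((?<!\.\s+)\b(?:[A-Z][\p{L}0-9-_]+)(?:\s+[A-Z][\p{L}0-9-_]+)*)\b/gu;

// Variant with "*": a lone capital letter such as "A" also counts as a word.
const oneOrMore = /((?<!\.\s+)\b(?:[A-Z][\p{L}0-9-_]*)(?:\s+[A-Z][\p{L}0-9-_]*)*)\b/gu;

// Hypothetical helper: return the text of every match as an array.
function findCapitalized(text, regex) {
  return Array.from(text.matchAll(regex), (m) => m[1]);
}

// Made-up sample containing the single-letter word "A".
const text = 'She gave A Plan to us';

console.log(findCapitalized(text, twoOrMore)); // [ 'She', 'Plan' ]
console.log(findCapitalized(text, oneOrMore)); // [ 'She', 'A Plan' ]
```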
Upvotes: 1