Reputation: 1163
Is it possible to write a regex pattern so that all matches are exhaustively replaced, without resorting to running the regex multiple times or using extra libraries like Perl's Regexp::Exhaustive, Ruby's string.scan(/regex/), etc. (the language is not important to this question)?
For example, let's say I need to replace a dash - with \-/ to allow proper hyphenation for compound words in a LaTeX document.
My regex so far would be (PCRE):
s/(\w+)-(\w+)/$1\\-\/$2/ig;
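In this admittedly artificial example, it does not replace every dash: each match consumes the words on both sides of a dash, so the dash that follows a matched word is skipped.
six-nation-golden-cup-award
will become
six\-/nation-golden\-/cup-award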
Is there a better regex that replaces all occurrences, so that one gets:
six\-/nation\-/golden\-/cup\-/award
Upvotes: 0
Views: 215
Reputation: 4659
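Your current regex is more complicated than it needs to be: it matches the entire words around the dash, so a word sitting between two dashes is consumed by the first match and cannot take part in the next one. I would match only the dash itself: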
\b-\b
Regex101 demo with substitution
\b means "word boundary". It is a zero-width assertion, so it consumes no characters; it only requires that a word starts or ends at that position. You can see from the Regex101 link that not every dash is matched. In regexes, "word" characters include digits and underscores (_), so with this pattern a string like 4-_ would be found and replaced with 4\-/_.
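As a rough illustration of the difference, here is a minimal Python sketch. The question is language-agnostic, so Python's re module is only a stand-in here, and the names REPLACEMENT, hyphenate and hyphenate_consuming are made up for the example. It applies \b-\b in a single pass and, for comparison, the consuming (\w+)-(\w+) pattern from the question.

```python
import re

# Replacement text from the question: a backslash, a dash, a slash.
REPLACEMENT = "\\-/"

def hyphenate(text: str) -> str:
    """Replace every dash that has a word character on both sides."""
    # A function is used as the replacement so the backslash in
    # REPLACEMENT is inserted literally instead of being parsed as an escape.
    return re.sub(r"\b-\b", lambda m: REPLACEMENT, text)

def hyphenate_consuming(text: str) -> str:
    """The question's pattern, for comparison: it consumes both neighbouring words."""
    return re.sub(
        r"(\w+)-(\w+)",
        lambda m: m.group(1) + REPLACEMENT + m.group(2),
        text,
    )

print(hyphenate("six-nation-golden-cup-award"))
print(hyphenate_consuming("six-nation-golden-cup-award"))
print(hyphenate("4-_"))    # digits and underscores count as word characters
print(hyphenate("a - b"))  # no word character next to the dash, so no match
```

Because \b consumes nothing, each match is a single dash and the words between dashes remain available as context for the next match. That is why one global substitution is enough.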
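If you write the lookbehind and lookahead yourself, you can define the character classes explicitly. So this:
(?<=[a-z])-(?=[a-z])
would require the preceding and following characters to be letters only, with no digits or underscores. As written, [a-z] covers lowercase letters only, so keep the i flag from your original regex if uppercase letters should match too.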
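The lookaround variant can be sketched the same way. Again this is only an illustration in Python: the names LETTER_DASH and hyphenate_letters are hypothetical, and re.IGNORECASE stands in for the i flag used in the question.

```python
import re

# Replacement text from the question: a backslash, a dash, a slash.
REPLACEMENT = "\\-/"

# The lookbehind and lookahead assert a letter on each side of the dash
# without consuming it; IGNORECASE lets [a-z] match uppercase letters too.
LETTER_DASH = re.compile(r"(?<=[a-z])-(?=[a-z])", re.IGNORECASE)

def hyphenate_letters(text: str) -> str:
    """Replace only dashes that sit directly between two letters."""
    # A function replacement keeps the backslash in REPLACEMENT literal.
    return LETTER_DASH.sub(lambda m: REPLACEMENT, text)

print(hyphenate_letters("six-nation-golden-cup-award"))
print(hyphenate_letters("Anglo-Saxon"))
print(hyphenate_letters("4-_"))  # left unchanged: neither neighbour is a letter
```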
Upvotes: 3