gsl

Reputation: 1163

How to exhaustively replace all regex matches?

Is it possible to write a regex pattern so that all matches are exhaustively replaced, without running the regex multiple times or resorting to extras such as Perl's Regexp::Exhaustive or Ruby's string.scan(/regex/)? (The language is not important to this question.)

For example, let's say I need to replace a dash - with \-/ to allow proper hyphenation for compound words in a LaTeX document.

My regex so far would be (PCRE):

s/(\w+)-(\w+)/$1\\-\/$2/ig;

In this admittedly artificial example, it will only replace every other dash.

six-nation-golden-cup-award

will become

six\-/nation-golden\-/cup-award

Is there a better regex that replaces all occurrences, so that one gets:

six\-/nation\-/golden\-/cup\-/award

Upvotes: 0

Views: 215

Answers (1)

asontu

Reputation: 4659

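Your current regex is more complicated than it needs to be: it matches the whole words on either side of the dash. Since a match consumes those words and matches cannot overlap, the word after one dash is no longer available as the word before the next dash. Matching only the dash itself avoids this. I would do this: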

\b-\b

Regex101 demo with substitution

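\b means "word boundary": a zero-width position with a word character on one side and a non-word character (or the start or end of the string) on the other. Since the dash itself is a non-word character, \b-\b matches only dashes that have a word character directly on both sides. You can see from the Regex101 link that not every dash is matched. In regexes, "word" characters include digits and the underscore _, so with this a string like 4-_ would also be found and replaced with 4\-/_.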
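As a rough illustration of the \b-\b approach, here is a minimal Python sketch. The helper name hyphenate and the sample strings are made up for this example, and Python's re module stands in for PCRE; the \b behaviour shown is the same for these inputs. A callable replacement is used only to avoid backslash escaping in the replacement template.

```python
import re

# The three literal characters \-/ (hypothetical constant name).
REPLACEMENT = "\\-/"


def hyphenate(text: str) -> str:
    """Replace every dash that sits directly between two word characters."""
    # \b is zero-width, so each match consumes only the dash itself and
    # the neighbouring words stay available for the next match.
    return re.sub(r"\b-\b", lambda m: REPLACEMENT, text)


print(hyphenate("six-nation-golden-cup-award"))  # six\-/nation\-/golden\-/cup\-/award
print(hyphenate("4-_"))                          # 4\-/_  (digits and _ count as word characters)
print(hyphenate("a - b"))                        # a - b  (no word character next to the dash)
```

A single pass is enough here because nothing but the dash is consumed, so no second run of the substitution is needed.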

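If you write the lookbehind and lookahead yourself, you can define the character classes explicitly. So this:

(?<=[a-z])-(?=[a-z])

requires the characters directly before and after the dash to be letters, with no digits or underscores. As written, [a-z] covers lowercase letters only; keep the i flag from your original regex if uppercase letters should count too.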
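The lookaround variant can be sketched the same way in Python. The names LETTER_DASH and hyphenate_letters are hypothetical, and re.IGNORECASE is an assumption that mirrors the i flag in the question's original substitution.

```python
import re

# The three literal characters \-/ (hypothetical constant name).
REPLACEMENT = "\\-/"

# Lookbehind and lookahead are zero-width: they check the neighbouring
# characters without consuming them, so only the dash is replaced.
LETTER_DASH = re.compile(r"(?<=[a-z])-(?=[a-z])", re.IGNORECASE)


def hyphenate_letters(text: str) -> str:
    """Replace dashes that have an ASCII letter directly on both sides."""
    return LETTER_DASH.sub(lambda m: REPLACEMENT, text)


print(hyphenate_letters("six-nation-golden-cup-award"))  # six\-/nation\-/golden\-/cup\-/award
print(hyphenate_letters("Anglo-Saxon"))                  # Anglo\-/Saxon
print(hyphenate_letters("4-_"))                          # 4-_  (digit and underscore are rejected)
```

Compared with \b-\b, the only difference is which neighbours qualify: the character classes inside the lookarounds can be widened or narrowed as the document requires.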

Regex101 demo

Upvotes: 3
