masahs

Reputation: 25

Regular Expression to find all occurrences of a word within a lookaround expression

I need to find every occurrence of "st" inside any class attribute on an HTML page, for example:

class="st0 st1 st2", class="st3 st45", class="st678"

I say within a class declaration because "st" may also occur elsewhere in the document, and I do not want to change every occurrence.

My ultimate goal here is a find and replace. I have the logic written out for that but I just need to figure out how to isolate "st" from the string.

I have experimented with a few different lookaround expressions but I cannot seem to match every occurrence. Below are a few examples of what I have been trying.

This expression gets everything between 'class="' and '"':

Regular Expression:

(?<=class=").*(?=")

Test string:

class="st10 st11"

Matching result:

"st10 st11"

Here is another one I tried:

Regular Expression:

(?<=class=")((st)\d*\s*)*(?=")

Test string:

class="st10 st11"

Matching result:

"st10 st11"

Matching groups:

  1. st11
  2. st

I have been testing my regular expressions at Rubular.com.

Added from comments:
I am going to use the regular expression in a terminal shell command that I will run on a specific folder. The shell command will do a find and replace on every file in the folder, like this:

perl -pi -w -e 's/st/stx/g;' ~/Desktop/svg_find_replace/*.svg

Any help would be much appreciated.

Upvotes: 1

Views: 1018

Answers (1)

bobble bubble

Reputation: 18490

You can use a regex based on \G to chain matches.

(?:class="|\G(?!^))(?:(?!st)[^"])*\Kst
  • (?: opens a non-capturing group, used here for the alternation.
  • (?:class="|\G(?!^)) sets where a match may start: either at class=" or at \G, the position where the previous match ended. \G would also match at the beginning of the string. To prevent this, the negative lookahead (?!^) is used.
  • (?:(?!st)[^"])* matches any number of characters that are not ". The negative lookahead (?!st) stops it right before an st, so no st is skipped over.
  • \K resets the beginning of the reported match, so only st itself is matched (and replaced).

Here is the demo at regex101. It is probably a rather advanced pattern. SO has a nice regex faq.

Upvotes: 1
