Reputation: 25
I need to find all occurrences of "st" within any class declaration on any HTML page, for example:
class="st0 st1 st2", class="st3 st45", class="st678"
I say "within a class declaration" because there may be other occurrences of "st" throughout the document, and I do not want to change every occurrence.
My ultimate goal here is a find and replace. I have the logic written out for that but I just need to figure out how to isolate "st" from the string.
I have experimented with a few different lookaround expressions but I cannot seem to match every occurrence. Below are a few examples of what I have been trying.
This expression gets everything between 'class="' and '"':
Regular Expression:
(?<=class=").*(?=")
Test string:
class="st10 st11"
Matching result:
"st10 st11"
Here is another one I tried:
Regular Expression:
(?<=class=")((st)\d*\s*)*(?=")
Test string:
class="st10 st11"
Matching result:
"st10 st11"
Matching groups:
I have been testing my regular expression here at Rubular.com
added from comments
I am going to be using the regular expression within a terminal shell command which I will run on a specific folder. The shell command will do a find and replace on every file that is in the folder like this...
perl -pi -w -e 's/st/stx/g;' ~/Desktop/svg_find_replace/*.svg
Any help would be much appreciated.
Upvotes: 1
Views: 1018
Reputation: 18490
You can use a regex based on \G to chain matches.
(?:class="|\G(?!^))(?:(?!st)[^"])*\Kst
(?: opens a non-capturing group for alternation.
(?:class="|\G(?!^)) the first part sets where the match starts. \G would also match at the beginning of the string. To prevent this, the negative lookahead (?!^) is used.
(?:(?!st)[^"])* this part matches any number of characters that are not " and prevents skipping over st by use of the negative lookahead (?!st).
\K resets the beginning of the reported match.
Here is the demo at regex101. It is probably a rather advanced pattern. SO has a nice regex faq.
Upvotes: 1