Reputation: 698
Yes yes, I know, "don't parse HTML with Regex". I'm doing this in notepad++ and it's a one-time thing so please bear with me for a moment.
I'm trying to simplify some HTML code by using some more advanced techniques. Notably, I have "inserts" or "callouts" or whatever you call them, in my documentation, indicating "note", "warning" and "technical" short phrases to grab the attention of the reader on important information:
<div class="note">
<p><strong>Notes</strong>: This icon shows you something that complements
the information around it. Understanding notes is not critical but
may be helpful when using the product.</p>
</div>
<div class="warning">
<p><strong>Warnings</strong>: This icon shows information that may
be critical when using the product.
It is important to pay attention to these warnings.</p>
</div>
<div class="technical">
<p><strong>Technical</strong>: This icon shows technical information
that may require some technical knowledge to understand. </p>
</div>
I want to simplify this HTML into the following:
<div class="box note"><strong>Notes</strong>: This icon shows you something that complements
the information around it. Understanding notes is not critical but
may be helpful when using the product.</div>
<div class="box warning"><strong>Warnings</strong>: This icon shows information that may
be critical when using the product.
It is important to pay attention to these warnings.</div>
<div class="box technical"><strong>Technical</strong>: This icon shows technical information
that may require some technical knowledge to understand.</div>
I almost have the regex necessary to do a nice global search & replace in my project from notepad++, but it's not picking up "only" the first div, it's picking up all of them - if my cursor is at the beginning of my file, the "select" when I click Find is from the first <div class="something">
up until the last </div>
, essentially.
Here's my expression: <div class="(.*[^"])">[^<]*<p>(.*?)<\/p>[^<]*<\/div>
(notepad++ "automatically" adds the / / around it, kinda).
What am I doing wrong, here?
Upvotes: 1
Views: 1121
Reputation: 27575
You have a greedy dot-quantifier while matching the class
attribute — that's the evil guy who's causing your problems.
Make it non-greedy: <div class="(.*?[^"])">
or change it to a character class: <div class="([^"]*)">
.
Upvotes: 2