SC-HELP

Reputation: 21

Grabbing all content within <ul> tag using Regex

I have inherited a website in which I have to update about 3500 files (product pages) whose content is roughly 95% identical.

In order to make some changes, I am using Regex (in Dreamweaver) to do some bulk editing.

I've been able to get everything else done OK, but I am running into a problem with the content within a <ul> tag.

I need to be able to grab all the content within that tag and save it for when I replace the other content on the page (this is one of the few things whose content is different from page to page).

Here is an example:

<ul>
<li style="padding-top:10px; text-align:right;"><a href="http://www.website.com/additem.wws?Sku=ABC123&sup=AAA&mfr=BBB&price=99.99&core=10.00&qty=1&description=ITEM">Single Item - $99.99 <img src="../../images/buy-now-button.gif" alt="Buy Now" width="50" height="20" border="0">&nbsp;&nbsp;&nbsp;&nbsp;</a></li>
<li style="padding-top:10px; text-align:right;"><a href="http://www.website.com/additem.wws?Sku=ABC123-6&sup=AAA&mfr=BBB&price=299.99&core=60.00&qty=1&description=INJECTOR"><strong>Set of 6 Items - $299.99</strong> <img src="../../images/buy-now-button.gif" alt="Buy Now" width="50" height="20" border="0">&nbsp;&nbsp;&nbsp;&nbsp;</a></li>
<li style="padding-top:10px"><img src="../../images/free_shipping.jpg" alt="Free Upgrade." width="227" height="107">  </li>
</ul>

I would work more granularly and grab the content of the individual <li> tags, but some pages have only one <li> within the <ul>, while others have up to 6, depending on the number of product variations on that page.

So my overall question is this: how do I grab all the content (including new lines, other tags, etc.) within a given tag and save it for when the rest of the content needs to be replaced? I know how to use parentheses around the content and then $# in the Replace section.

The websites I've worked on thus far have been much smaller, and I've not had much need for Regex because it was typically easier to make changes manually or just using literal text in Find/Replace.

Upvotes: 2

Views: 4646

Answers (1)

Alan Moore

Reputation: 75232

How complex are these web pages? If <ul> elements are never nested inside other <ul> elements, and you don't have to deal with bogus tags inside (for example) SGML comments or CDATA sections, this is probably all you need:

<ul>[\s\S]*?</ul>

[\s\S] is how you match any character including newlines in JavaScript regexes (which is what Dreamweaver uses, or so I've read).
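As a rough illustration of why [\s\S] is used instead of a plain dot, here is a minimal sketch in plain JavaScript (not Dreamweaver itself). The sample markup is a shortened, made-up stand-in for the list in the question.

```javascript
// Shortened, hypothetical stand-in for the multi-line <ul> in the question.
const html = "<ul>\n<li>Single Item</li>\n<li>Set of 6 Items</li>\n</ul>";

// Without the s flag, "." does not match line terminators,
// so this pattern cannot get past the first newline.
const dotMatch = html.match(/<ul>.*?<\/ul>/);

// [\s\S] means "whitespace or non-whitespace", i.e. any character at all.
const anyMatch = html.match(/<ul>[\s\S]*?<\/ul>/);

console.log(dotMatch === null);    // true
console.log(anyMatch[0] === html); // true
```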

*? tells it to match zero or more characters reluctantly, meaning it stops matching as soon as the next part of the regex (</ul>) is able to match.
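The difference between the reluctant and greedy quantifiers, and the capture-and-reuse step the question asks about, can be sketched as follows. This is plain JavaScript under assumed conditions: the page text, the <h1>/<h2> headings, and the variable names are invented for the example, and Dreamweaver's Replace field is assumed to accept $1 in the same way as JavaScript's replace.

```javascript
// Hypothetical page fragment containing two separate lists.
const page =
  "<h1>Old</h1>\n<ul>\n<li>A</li>\n</ul>\n<p>middle</p>\n<ul>\n<li>B</li>\n</ul>";

// Reluctant: stops at the first </ul> it can reach.
const lazy = page.match(/<ul>[\s\S]*?<\/ul>/)[0];

// Greedy: runs on to the last </ul> in the text, swallowing both lists.
const greedy = page.match(/<ul>[\s\S]*<\/ul>/)[0];

console.log(lazy === "<ul>\n<li>A</li>\n</ul>");           // true
console.log(greedy === page.slice(page.indexOf("<ul>")));  // true

// Parentheses capture the list; $1 puts it back while the
// surrounding markup (here, the heading) is replaced.
const updated = page.replace(
  /<h1>Old<\/h1>\n(<ul>[\s\S]*?<\/ul>)/,
  "<h2>New</h2>\n$1"
);

console.log(updated.startsWith("<h2>New</h2>\n<ul>\n<li>A</li>\n</ul>")); // true
```

The same idea should carry over to a bulk find/replace: wrap the <ul> pattern in parentheses inside the larger Find expression, then reference it as $1 in the replacement text.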

Upvotes: 7
