Ya Wang

Reputation: 1808

Regex match any li of ul that contains text

I have a string

<ul><li>Option to add embroidered text personalization below design<br/>for only $1.00 per shirt and free setup</li><li>Men&#39;s Sizes: XS-6XL</li><li>Individually folded and bagged with size sticker for easy distribution</li><li>Ready to ship in 7 business days after art approval</li></ul>

I'm trying to match

<li>Men&#39;s Sizes: XS-6XL</li>

I want to match only the single <li></li> pair that contains the given words.

So, for the li that contains sizes, I am trying something like:

(<li>).*?\b[sS]izes[ :]{1}.*?<\/li>

but that starts the match at the first <li> in the string instead of the one closest to the keyword.
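
To illustrate the behavior, here is a minimal sketch in JavaScript. The question does not name a regex engine, so the language is an assumption, and the variable names are only illustrative. A match always starts at the leftmost position where it can succeed, and the lazy .*? only limits how far the match extends to the right, so the match begins at the first <li>:

```javascript
// Sample input copied from the question (HTML entity left as-is).
const html = "<ul><li>Option to add embroidered text personalization below design<br/>for only $1.00 per shirt and free setup</li><li>Men&#39;s Sizes: XS-6XL</li><li>Individually folded and bagged with size sticker for easy distribution</li><li>Ready to ship in 7 business days after art approval</li></ul>";

// The pattern from the question. The lazy quantifier does not move the
// start of the match; it only stops the match as early as possible.
const lazy = /(<li>).*?\b[sS]izes[ :]{1}.*?<\/li>/;
const lazyMatch = html.match(lazy)[0];

// The match spans two list items: it starts at the first <li> and ends
// at the </li> that closes the "Sizes" item.
console.log(lazyMatch.startsWith("<li>Option"));
console.log(lazyMatch.endsWith("Sizes: XS-6XL</li>"));
```

Both checks log true, which is why the result contains the first list item as well as the one with the sizes.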

EDIT: I can't use an HTML parser like HTMLAgilityPack here.

Upvotes: 0

Views: 1001

Answers (2)

You can use the innerHTML and innerText properties like this:

const str = "<ul><li>Option to add embroidered text personalization below design<br/>for only $1.00 per shirt and free setup</li><li>Men&#39;s Sizes: XS-6XL</li><li>Individually folded and bagged with size sticker for easy distribution</li><li>Ready to ship in 7 business days after art approval</li></ul>";
// Parse the string into a detached container element.
const el1 = document.createElement('div');
el1.innerHTML = str;
const liArr = el1.getElementsByTagName('li');
const resultsText = [];
const resultsHTML = [];
for (const listElement of liArr) {
    // Keep every <li> whose text contains 'Size' (case-sensitive).
    if (listElement.innerText.includes('Size')) {
        resultsText.push(listElement.innerText);
        resultsHTML.push(listElement);
    }
}
console.log('resultsText:');
console.log(resultsText);
console.log('resultsHTML:');
console.log(resultsHTML);

Upvotes: 0

D M

Reputation: 7179

I'd use the pattern:

<li>[^<]*[Ss]izes[^<]*<\/li>

It works like this:

Element     Matches
<li>        The opening tag
[^<]*       Zero or more characters that are not the start of a new tag (<)
[Ss]izes    The keyword we are looking for
[^<]*       Zero or more characters that are not the start of a new tag (<)
<\/li>      The closing tag


And I'd take the last such matching element.
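
As a rough sketch of how that could look in JavaScript: the language is an assumption, since the question does not name one, and the variable names are only illustrative. The idea is to collect all matches of the pattern and keep the last one:

```javascript
// Sample input copied from the question (HTML entity left as-is).
const html = "<ul><li>Option to add embroidered text personalization below design<br/>for only $1.00 per shirt and free setup</li><li>Men&#39;s Sizes: XS-6XL</li><li>Individually folded and bagged with size sticker for easy distribution</li><li>Ready to ship in 7 business days after art approval</li></ul>";

// [^<]* cannot cross a tag boundary, so each match stays inside one <li>.
// The g flag is required by String.prototype.matchAll.
const pattern = /<li>[^<]*[Ss]izes[^<]*<\/li>/g;

// Collect the full text of every match, then take the last one (or null).
const matches = [...html.matchAll(pattern)].map((m) => m[0]);
const last = matches.length > 0 ? matches[matches.length - 1] : null;

console.log(last);
```

For the sample string this finds a single match, the "Men&#39;s Sizes" item. One limitation to keep in mind: because [^<]* stops at any <, an li that has a nested tag (such as <br/>) in it will not match this pattern.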

Upvotes: 1
