TSCAmerica.com

Reputation: 5407

Regular Expression Pattern for list and td tag

I am using a script that should read the contents of an <li> tag. Please note that the <li> tag has no closing tag.

<tr class="colors_background_main"> 
          <td colspan="2"> <li>Tools Color:blue
</td>
        </tr>

I tried the following regular expression, but it does not return any results:

<li>\S*(.*)?</td>

Classic ASP code to read the contents

Set objRegExp = New RegExp

strPattern = "(?s)<li>\S*(.*?)</td>"


'<td[^>]*><p>.+?<\/p><\/td>"

objRegExp.Pattern = strPattern
objRegExp.IgnoreCase = True
objRegExp.Global = True

Set colMatches = objRegExp.Execute(strContents)

If colMatches.Count > 0 Then
    strTitle = colMatches(0).Value
Else
    strTitle = ""
End If

Set objRegExp = Nothing

Response.write(strTitle)

Upvotes: 0

Views: 263

Answers (1)

falsetru

Reputation: 369354

. does not match newline unless you use DOTALL mode.

Try the following:

(?s)<li>\S*(.*?)</td>

This will not work unless your regular expression engine supports DOTALL mode. If it does not, use the following regular expression:

<li>\S*([\s\S]*?)</td>
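
For the Classic ASP code in the question, a minimal sketch of how this pattern might be plugged in is shown below. VBScript's RegExp object does not accept the (?s) inline flag, so the [\s\S] form is the one to use there. The sample strContents value is an assumption that mirrors the markup in the question; in the real page it would come from wherever the HTML is loaded.

```vbscript
Dim strContents, objRegExp, colMatches, strTitle

' Assumed sample input, copied from the markup in the question.
strContents = "<tr class=""colors_background_main"">" & vbCrLf & _
              "  <td colspan=""2""> <li>Tools Color:blue" & vbCrLf & _
              "</td>" & vbCrLf & _
              "</tr>"

Set objRegExp = New RegExp
' [\s\S] matches any character, including line breaks.
objRegExp.Pattern = "<li>\S*([\s\S]*?)</td>"
objRegExp.IgnoreCase = True
objRegExp.Global = True

Set colMatches = objRegExp.Execute(strContents)

If colMatches.Count > 0 Then
    ' .Value is the whole match, including the <li> and </td> tags.
    ' SubMatches(0) is only the text captured by the parentheses.
    strTitle = colMatches(0).SubMatches(0)
Else
    strTitle = ""
End If

Set objRegExp = Nothing

Response.Write Server.HTMLEncode(strTitle)
```

Note that \S* (uppercase S) consumes the run of non-whitespace characters directly after <li>, so the captured group starts after that run. If the whole text after <li> is wanted, \s* (lowercase, optional whitespace) may be what was intended.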

NOTE

The original regular expression will finish at the last </td>. You should write (.*?), which is non-greedy, whereas your (.*)? makes the group optional. (Thank you @DioF)
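
The difference between the two forms can be seen with a small Classic ASP sketch. The sample string is made up for illustration and is kept on one line so that . can match everything in it.

```vbscript
Dim strSample, objRe

' Hypothetical input containing two </td> tags.
strSample = "<td><li>one</td><td>two</td>"

Set objRe = New RegExp
objRe.IgnoreCase = True

' (.*)? is an optional group around a greedy .*, so it runs to the last </td>.
objRe.Pattern = "<li>(.*)?</td>"
Response.Write Server.HTMLEncode(objRe.Execute(strSample)(0).SubMatches(0)) & "<br>"
' writes: one</td><td>two

' (.*?) is non-greedy, so it stops at the first </td>.
objRe.Pattern = "<li>(.*?)</td>"
Response.Write Server.HTMLEncode(objRe.Execute(strSample)(0).SubMatches(0)) & "<br>"
' writes: one

Set objRe = Nothing
```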

Upvotes: 4
