Reputation: 704
I have a variable holding structured HTML (the content of a web page), and I simply want to extract the content of the <article> element, which looks like this:
<article>
html stuff here html stuff here html stuff here html stuff here
html stuff here html stuff here html stuff here html stuff here
</article>
I'm trying with:
Dim url
url = "myUrl"
Set objXML = CreateObject("MSXML2.ServerXMLHTTP")
Set myDiv = New RegExp
With myDiv
    .Pattern = "<article>.*</article>"
    .IgnoreCase = True
    .Global = False
End With
objXML.Open "GET", url, False
objXML.Send("")
html = objXML.responseText
Set objMatch = myDiv.Execute(html)
For Each x In objMatch
    WScript.Echo objMatch.Item(0)
Next
I also tried:
.Pattern = "#<article>([^<]*)</article>#'"
.Pattern = "<article>([^<]*)</article>'"
None of these work. Any suggestions?
Upvotes: 0
Views: 226
Reputation: 3175
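Use this regex, which captures everything between the tags, including line breaks:
<article>([\s\S]*?)</article>
In VBScript's RegExp, "." does not match newlines, so ".*" stops at the first line break; [\s\S] matches any character. VBScript's RegExp also does not support lookbehind ((?<=...)), so use a capture group and read it from SubMatches instead.
Example (not tested)
Set myDiv = New RegExp
With myDiv
    .Pattern = "<article>([\s\S]*?)</article>"
    .IgnoreCase = True
    .Global = False
End With
Set objMatch = myDiv.Execute(html)
' SubMatches(0) holds the text between the tags
If objMatch.Count > 0 Then WScript.Echo objMatch(0).SubMatches(0)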
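For anyone who wants to try this without a network request, here is a minimal self-contained sketch. The sample HTML string is made up and stands in for objXML.responseText; the variable names re, matches and m are illustrative only. Because VBScript's RegExp has no lookbehind, the sketch uses a capture group and reads the inner content from SubMatches. It assumes the opening tag is a bare <article> with no attributes.

```vbscript
Option Explicit

Dim html, re, matches, m

' Hypothetical sample input standing in for objXML.responseText
html = "<html><body>" & vbCrLf & _
       "<article>" & vbCrLf & _
       "first line" & vbCrLf & _
       "second line" & vbCrLf & _
       "</article>" & vbCrLf & _
       "</body></html>"

Set re = New RegExp
' [\s\S] matches any character including line breaks ("." does not);
' *? is lazy, so each match ends at the nearest </article>
re.Pattern = "<article>([\s\S]*?)</article>"
re.IgnoreCase = True
re.Global = True   ' return every <article> block, not only the first

Set matches = re.Execute(html)
For Each m In matches
    ' Echo the loop variable's capture group, not matches.Item(0)
    WScript.Echo m.SubMatches(0)
Next
```

Save it as a .vbs file and run it with cscript. If the real page puts attributes on the tag (for example a class), the opening part of the pattern would need to allow for them. For anything beyond a simple extraction like this, an HTML parser is usually more robust than a regex.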
Upvotes: 1