astazed

Reputation: 649

Parse XML error in VBScript

I have this simple VBScript that sends an HTTP POST request and reads the returned HTML response.

Function httpPOST(url, body, username, password )  
  Set Http = CreateObject("Msxml2.ServerXMLHTTP")   
  Http.Open "POST", url, False, username, password  
  Http.setRequestHeader _  
              "Content-Type", _  
              "application/x-www-form-urlencoded"  
  Http.send body 
  pagestatus = Http.status
  if pagestatus<> "200" then
    httpPOST="Error:"& pagestatus
  else
    'httpPOST = Http.ResponseBody
    'httpPOST = Http.responseText
    Set objXMLDoc = CreateObject("MSXML.DOMDocument")
    objXMLDoc.async = False
    objXMLDoc.validateOnParse = False
    objXMLDoc.load(Http.ResponseBody)
    Set objNode = objXMLDoc.selectSingleNode("/html/body/center/img")
    httpPost = objNode.getAttribute("alt") 
  end if
End Function

The HTML response format is the following:

<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
        <title>---</title>
    </head>
    <body>
        <center>
            <img alt="You are now connected" src="pages/GEN/connected_gen.png">
        </center>
    </body>
</html>

The issue with this script is that it always fails with the error: Object required: 'objNode'

I have tried many variations of XML parsers and finally gave up, because every time I got the same error related to XML objects.

Upvotes: 1

Views: 4537

Answers (1)

Ekkehard.Horner

Reputation: 38745

Your first problem is addressed in the documentation for .load, which expects 'A string containing a URL that specifies the location of the XML file'. Use .loadXML instead; that will also tell you whether Http.ResponseBody contains data that MSXML?.DOMDocument can parse (your second problem).

UPDATE:

Something that 'works' (and why):

  Dim sHTML : sHTML = readAllFromFile("..\data\02.html")
  WScript.Echo sHTML
  Dim oXDoc : Set oXDoc = CreateObject("MSXML2.DOMDocument")
  oXDoc.async = False
  oXDoc.validateOnParse = False
  oXDoc.setProperty "SelectionLanguage", "XPath"
  If oXDoc.loadXML(sHTML) Then
     Dim ndImg : Set ndImg = oXDoc.selectSingleNode("/html/body/center/img")
     Dim httpPost : httpPost = ndImg.getAttribute("alt")
     WScript.Echo "ok", httpPost
  Else
     WScript.Echo "Error: " & trimWS(oXDoc.parseError.reason)
  End If

output:

<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
        <title>---</title>
    </head>
    <body>
        <center>
            <img alt="You are now connected" src="pages/GEN/connected_gen.png"/>
        </center>
    </body>
</html>

ok You are now connected

MSXML2.DOMDocument will .loadXML (and parse) HTML code, provided it is 'XML-valid', i.e. well-formed XML. Your HTML isn't, because the img tag is not closed. The error message I got for your original HTML:

Error: End tag 'center' does not match the start tag 'img'.

How to proceed further depends on whether you are able/willing to change the HTML.
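To reproduce the difference without the helper functions used above (readAllFromFile and trimWS are not shown in this answer), here is a minimal self-contained sketch. The function name altFromXml and the inline test strings are assumptions made for illustration. In the original httpPOST function you would presumably pass Http.responseText rather than a literal string.

```vbscript
Option Explicit

' Hypothetical helper (not part of the original answer): returns the alt
' text of /html/body/center/img, or a string starting with "Error: ".
Function altFromXml(sXml)
    Dim oXDoc : Set oXDoc = CreateObject("MSXML2.DOMDocument")
    oXDoc.async = False
    oXDoc.validateOnParse = False
    oXDoc.setProperty "SelectionLanguage", "XPath"

    ' loadXML returns False when the string is not well-formed XML
    If Not oXDoc.loadXML(sXml) Then
        ' the reason text may end with a line break, so strip it
        altFromXml = "Error: " & Replace(oXDoc.parseError.reason, vbCrLf, "")
        Exit Function
    End If

    ' selectSingleNode returns Nothing when no node matches; checking for
    ' that avoids the "Object required" error from the question
    Dim ndImg : Set ndImg = oXDoc.selectSingleNode("/html/body/center/img")
    If ndImg Is Nothing Then
        altFromXml = "Error: img node not found"
    Else
        altFromXml = ndImg.getAttribute("alt")
    End If
End Function

' Same markup twice: once with the unclosed img, once with a self-closed img
Dim sOpen : sOpen = "<html><body><center>" & _
    "<img alt=""You are now connected"" src=""pages/GEN/connected_gen.png"">" & _
    "</center></body></html>"
Dim sClosed : sClosed = Replace(sOpen, ".png"">", ".png""/>")

WScript.Echo altFromXml(sOpen)
WScript.Echo altFromXml(sClosed)
```

Run with cscript, the first call should report a parse error and the second should return the alt text, which matches the two results shown above.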

UPDATE II:

While you could do nasty things to .ResponseBody before you feed it to .loadXML, why not use an HTML tool to parse HTML?

  Dim sHTML : sHTML = readAllFromFile("..\data\01.html")
  WScript.Echo sHTML
  Dim oHF : Set oHF = CreateObject("HTMLFILE")
  oHF.write sHTML
  Dim httpPost : httpPost = oHF.documentElement.childNodes(1).childNodes(0).childNodes(0).alt
  WScript.Echo "ok", httpPost

output:

<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
        <title>---</title>
    </head>
    <body>
        <center>
            <img alt="You are now connected" src="pages/GEN/connected_gen.png">
        </center>
    </body>
</html>

ok You are now connected

As the output shows, HTMLFILE accepts your 'not-xml-closed' img. The hard-coded childNodes chain used to reach the img is fragile, of course, and should be replaced with something more robust.
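One possible way to make the lookup less dependent on the exact nesting is to ask the document for its img elements by tag name. This is only a sketch: the inline sHTML string stands in for the server response, and picking the first img is an assumption about what the page contains.

```vbscript
Option Explicit

' Stand-in for the HTTP response; note the unclosed img tag
Dim sHTML : sHTML = "<html><body><center>" & _
    "<img alt=""You are now connected"" src=""pages/GEN/connected_gen.png"">" & _
    "</center></body></html>"

Dim oHF : Set oHF = CreateObject("HTMLFILE")
oHF.write sHTML
oHF.close

' Look the element up by tag name instead of walking childNodes by index
Dim cImgs : Set cImgs = oHF.getElementsByTagName("img")
If cImgs.length > 0 Then
    WScript.Echo "ok", cImgs.item(0).alt
Else
    WScript.Echo "Error: no img element found"
End If
```

Checking the collection length first means a response without any img yields a readable message instead of a runtime error.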

Upvotes: 2
