David912
David912

Reputation: 396

Using getElementsByClassName with VBA

I wrote the following code in order to retrieve data from an API and hopefully get the content of a specific element. I wrote similar code that works with web pages (not API) but in this example it breaks and don't understand why.

Sub parseXML()
Dim xmldoc As Object
Dim obj As Object
Dim MyRequest As Object

Set MyRequest = CreateObject("MSXML2.XMLHTTP")
MyRequest.Open "GET", 
"https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi? db=pubmed&term=ABL1%20Acute%20granulocytic%20leukemia"     
 MyRequest.send
 While Not MyRequest.readyState = 4
       DoEvents
    Wend
Set xmldoc = MyRequest.responseXML
Set obj = xmldoc.DocumentElement.getElementsByClassName("ui-ncbihistogram-display-area")(0) 'Error: object doesn't support this property or method 
End Sub 

How should I write the last line in order to remove the error? I am trying to get the element class="ui-ncbihistogram-display-area". It has several <li>elements and at the end I want to retrieve the inner text of each of these<li> elements.

Upvotes: 1

Views: 863

Answers (1)

QHarr
QHarr

Reputation: 84465

You are trying to use the syntax that would work for an HTML Parser selecting off the pubmed search for the same term i.e. you would be selecting for the histogram on this page.

However, with the API call you are making you are getting an XML document which will have specified fields according the API documentation. Instead, you would use xpath to specify what you want if using xml parser (this will allow you to include more specificity in terms of attributes and parent child relationships). If you are going to use HTML parser then you want to look at selecting by tag. The field present in the webpage, for the histogram, is not present in the response. Nor is that information present in the return xml document. The figure that sort of matches is you get the total result count for the query i.e. 248. Though it is broken down by year in the webpage histogram whereas it is a total in the XML API response.

Sample comparison:

enter image description here

I would recommend using an XML parser for XML. You then get the expected result hopefully. Compare the two parsers here where there is loss of content in the HTMLDocument.

Public Sub DifferentParsing()
    Dim xmlDoc As Object, htmlDoc As HTMLDocument, url As String

    Set xmlDoc = CreateObject("MSXML2.DOMDocument") 'New MSXML2.DOMDocument60
    Set htmlDoc = New HTMLDocument
    url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?%20db=pubmed&term=ABL1%20Acute%20granulocytic%20leukemia"

    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False
        .send
        htmlDoc.body.innerHTML = .responseText
        xmlDoc.LoadXML .responseXML.XML
    End With
    Debug.Print xmlDoc.SelectSingleNode("//Count").Text
    Debug.Print htmlDoc.getElementsByTagName("Count")(0).innerText
End Sub

Upvotes: 2

Related Questions