Reputation: 33
I've already spent two weeks searching unsuccessfully how to parse one specific XML and fetch just few values. I already tried every single code on internet until I found one that solved part of my problem.
The XML i'm trying to fetch it's from U.S Department of Agriculture, and is free to access.
https://apps.fas.usda.gov/psdonline/app/index.html#/app/about
Dim xmlDoc As MSXML2.DOMDocument60
Dim xmlNode As MSXML2.IXMLDOMNode
Dim xmlNodeList As MSXML2.IXMLDOMNodeList
Dim myNode As MSXML2.IXMLDOMNode
Dim URL As String, APIkey As String
APIkey = "8DB688F8-1E22-4031-B581-59C221ECDDA6"
URL = "https://apps.fas.usda.gov/PSDOnlineDataServices/api/CommodityData/GetCommodityDataByYear?commodityCode=2222000&marketYear=2018"
Set xmlDoc = New MSXML2.DOMDocument60
xmlDoc.async = False
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.SetRequestHeader "Accept", "text/xml"
.SetRequestHeader "API_KEY", APIkey
.Send
xmlDoc.loadXML .ResponseText
End With
Set xmlNodeList = xmlDoc.getElementsByTagName("*")
For Each xmlNode In xmlNodeList
For Each myNode In xmlNode.childNodes
If myNode.nodeType = NODE_TEXT Then
Debug.Print xmlNode.nodeName & "=" & xmlNode.text
End If
Next myNode
Next xmlNode
Set xmlDoc = Nothing
End Sub
The response of this code show the entire XML listed, but when I try to find one specific node, the code result it's nothing.
in
Set xmlNodeList = xmlDoc.getElementsByTagName("*")
I've tried to use the address "//AttributeDescription", but apparently just work using the "*".
I need to receive, for example, The response below:
AttributeDescription=Production
CountryName=Brazil
Value=0.00000
I did my best trying to get the right response and I also consider that the XML structure it's not in the right format due the lack of response when addressing...
Is there anything that I can do to solve this issue?
Upvotes: 2
Views: 1683
Reputation: 33
I've reached this solution, mixing the two answers, and sharing the code to help others.
First I set the property and then used the iteration to retrieve the values I needed, I don't know if this is the best solution, since I can't control the XML structure and if they change their file I'll need to return to this code.
I tried to work in a "Safety Line" to avoid any mistake in the output, but no problem for me to double check since I have access to the data itself.
If node.childNodes(0).text = "Production" And node.childNodes(6).text = "Argentina" Then
To ensure that the name and response will bring whatever I want.
Public Sub test_3()
Dim xmlDoc As MSXML2.DOMDocument60
Dim URL As String, APIkey As String
APIkey = "8DB688F8-1E22-4031-B581-59C221ECDDA6"
URL = "https://apps.fas.usda.gov/PSDOnlineDataServices/api/CommodityData/GetCommodityDataByYear?commodityCode=2222000&marketYear=2018"
Set xmlDoc = New MSXML2.DOMDocument60
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.SetRequestHeader "Accept", "text/xml"
.SetRequestHeader "API_KEY", APIkey
.Send
xmlDoc.loadXML .ResponseText
xmlDoc.SetProperty "SelectionNamespaces", "xmlns:r='http://schemas.datacontract.org/2004/07/PSDOnline.DataServices.Models'"
End With
Dim node As IXMLDOMElement, r As Long
For Each node In xmlDoc.selectNodes("//r:CommodityData")
If node.childNodes(0).text = "Production" And node.childNodes(6).text = "Argentina" Then
r = r + 1
Debug.Print node.childNodes(0).text
Debug.Print node.childNodes(6).text
Debug.Print node.LastChild.text
'With ActiveSheet
'.Cells(r, 1) = node.childNodes(0).text
'.Cells(r, 2) = node.childNodes(6).text
'.Cells(r, 3) = node.LastChild.text
'End With
End If
Next
End Sub
This solution return the following response in the DEBUGGER:
Production
Argentina
55300.0000
Exactly what I wanted.
Thanks again for the time and for sharing knowledge.
Upvotes: 1
Reputation: 10679
There are two separate issues here.
MSXML2 has issues using XPath when the XML document has a default namespace - see here for details. At the start of the downloaded document from the USDA site, there are some namespace declarations:
<ArrayOfCommodityData xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/PSDOnline.DataServices.Models">
There are two namespaces declared here. One with the prefix i
and a default namespace that covers any element which does not have a namespace prefix. If you look at a "CalendarYear" entry in the XML document - <CalendarYear i:nil="true" />
- then you can see that "CalendarYear" is in the default namespace whereas "nil" is in the "i" namespace.
To make MSXML2 work with default namespaces, you have to declare a namespace which has the same URI as the default namespace. This is done using the SelectionNamespaces property of the XML document, like this:
xmlDoc.SetProperty "SelectionNamespaces", "xmlns:r='http://schemas.datacontract.org/2004/07/PSDOnline.DataServices.Models'"
I chose r
as the namespace but the name you choose is irrelevant - it just has to be different from any other namespaces in the document.
This leads on to the second problem. You are using getElementsByTagName
which just takes a tag name as a parameter but you are passing in an XPath string. To deal with an XPath string, you need to use SelectNodes
instead and you need to use the namespace we added, like this:
Set xmlNodeList = xmlDoc.SelectNodes("//r:AttributeDescription")
Upvotes: 2
Reputation: 84465
It's a namespace issue I think. There are people more familiar with this who can likely fix how to add properly and then reference. I did try adding the two namespaces with the usual syntax .setProperty "SelectionNamespaces", namespace
but still failed to set objects so guess I did something wrong.
An interim, less robust solution is as follows:
Option Explicit
Public Sub test()
Dim xmlDoc As MSXML2.DOMDocument60
Dim URL As String, APIkey As String
APIkey = "key"
URL = "https://apps.fas.usda.gov/PSDOnlineDataServices/api/CommodityData/GetCommodityDataByYear?commodityCode=2222000&marketYear=2018"
Set xmlDoc = New MSXML2.DOMDocument60
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.SetRequestHeader "Accept", "text/xml"
.SetRequestHeader "API_KEY", APIkey
.Send
xmlDoc.LoadXML .responseText
End With
Dim node As IXMLDOMElement, r As Long
For Each node In xmlDoc.SelectNodes("/*[name()='ArrayOfCommodityData']/*[name()='CommodityData']")
r = r + 1
With ActiveSheet
.Cells(r, 1) = node.ChildNodes(0).Text
.Cells(r, 2) = node.ChildNodes(6).Text
.Cells(r, 3) = node.ChildNodes(11).Text
End With
Next
End Sub
Upvotes: 3