Reputation: 4023
I have used the xml.Unmarshal method to get a struct object, but it has its own limitations. I need a way to get all the descendants of a particular type inside a node without specifying the exact XPath.
For example, I have XML data in the following format:
<content>
    <p>this is content area</p>
    <animal>
        <p>This id dog</p>
        <dog>
            <p>tommy</p>
        </dog>
    </animal>
    <birds>
        <p>this is birds</p>
        <p>this is birds</p>
    </birds>
    <animal>
        <p>this is animals</p>
    </animal>
</content>
Now I want to traverse the above XML and process each node and its children in that order. The problem is that this structure is not fixed and the order of elements may change. So I need a way to traverse it like this:
while (Content.nextnode())
{
    switch (type of node)
    {
        // Process the node or traverse the child node deeper
    }
}
Upvotes: 25
Views: 20925
Reputation: 36199
You can do it with vanilla encoding/xml by using a recursive struct and a simple walk function:
type Node struct {
	XMLName xml.Name
	Content []byte `xml:",innerxml"`
	Nodes   []Node `xml:",any"`
}

func walk(nodes []Node, f func(Node) bool) {
	for _, n := range nodes {
		if f(n) {
			walk(n.Nodes, f)
		}
	}
}
Playground example: http://play.golang.org/p/rv1LlxaHvK.
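For readers who cannot open the playground link, here is a minimal, self-contained sketch of how the struct and walk function might be used on a shortened version of the XML from the question. The `sampleXML` constant and the `elementNames` helper are hypothetical names introduced only for this illustration; they are not part of the original answer.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Node is the recursive struct from the answer: it records the element
// name, the raw inner XML, and every child element regardless of its tag.
type Node struct {
	XMLName xml.Name
	Content []byte `xml:",innerxml"`
	Nodes   []Node `xml:",any"`
}

// walk visits nodes depth-first in document order. The callback returns
// false to skip the children of the node it was given.
func walk(nodes []Node, f func(Node) bool) {
	for _, n := range nodes {
		if f(n) {
			walk(n.Nodes, f)
		}
	}
}

// sampleXML is a shortened version of the document in the question.
const sampleXML = `<content>
<p>this is content area</p>
<animal><p>This id dog</p><dog><p>tommy</p></dog></animal>
<birds><p>this is birds</p></birds>
</content>`

// elementNames (hypothetical helper) returns every element name in the
// order walk visits them.
func elementNames(data string) ([]string, error) {
	var root Node
	if err := xml.Unmarshal([]byte(data), &root); err != nil {
		return nil, err
	}
	var names []string
	walk([]Node{root}, func(n Node) bool {
		names = append(names, n.XMLName.Local)
		return true // descend into every subtree
	})
	return names, nil
}

func main() {
	names, err := elementNames(sampleXML)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Inside the callback you can switch on `n.XMLName.Local` to process each element type differently, which is close to the loop sketched in the question.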
EDIT: Here's a version with attrs:
type Node struct {
	XMLName xml.Name
	Attrs   []xml.Attr `xml:",any,attr"`
	Content []byte     `xml:",innerxml"`
	Nodes   []Node     `xml:",any"`
}

func (n *Node) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	n.Attrs = start.Attr
	type node Node
	return d.DecodeElement((*node)(n), &start)
}
Playground: https://play.golang.org/p/d9BkGclp-1.
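As a possible variation, assuming Go 1.8 or later: the encoding/xml documentation describes the `,any,attr` tag as collecting attributes that no other field handles, so a sketch that relies on the tag alone, without a custom UnmarshalXML method, could look like the following. The `AttrNode` type and `parseAttrs` helper are hypothetical names used only here.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// AttrNode is a hypothetical variant of the struct above that relies only
// on the ",any,attr" tag (assumes Go 1.8+) to gather attributes.
type AttrNode struct {
	XMLName xml.Name
	Attrs   []xml.Attr `xml:",any,attr"`
	Nodes   []AttrNode `xml:",any"`
}

// parseAttrs (hypothetical helper) unmarshals a document into an AttrNode tree.
func parseAttrs(data string) (AttrNode, error) {
	var root AttrNode
	err := xml.Unmarshal([]byte(data), &root)
	return root, err
}

func main() {
	root, err := parseAttrs(`<a x="1" y="2"><b z="3"/></a>`)
	if err != nil {
		panic(err)
	}
	// Print each attribute of the root element.
	for _, a := range root.Attrs {
		fmt.Printf("%s=%s\n", a.Name.Local, a.Value)
	}
}
```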
Upvotes: 44
Reputation: 1291
xmlquery can parse an XML document into a DOM tree so that you can traverse all of its nodes, much like Go's html package.
Upvotes: 1
Reputation: 19824
I did a bit of searching on how to deal with a generic XML DOM, and the closest you can get with the standard library is to use decoder.Token() or decoder.RawToken().
However if you're willing to consider a library I found this one to be very easy to pick up: https://github.com/beevik/etree
doc := etree.NewDocument()
if err := doc.ReadFromFile("bookstore.xml"); err != nil {
	panic(err)
}

root := doc.SelectElement("bookstore")
fmt.Println("ROOT element:", root.Tag)

for _, book := range root.SelectElements("book") {
	fmt.Println("CHILD element:", book.Tag)
	if title := book.SelectElement("title"); title != nil {
		lang := title.SelectAttrValue("lang", "unknown")
		fmt.Printf(" TITLE: %s (%s)\n", title.Text(), lang)
	}
	for _, attr := range book.Attr {
		fmt.Printf(" ATTR: %s=%s\n", attr.Key, attr.Value)
	}
}
It uses the built-in XML parser in the manner described above.
Upvotes: 4