Reputation: 3596
I have an HTML document:
<value>1,2,3</value>
<value>,1,3,5</value>
and want to extract the text with the code below, but it only prints the tag name 'value'. How can I print the text between the tags instead, using the Go html package?
z := html.NewTokenizer(b)
for {
	tt := z.Next()
	switch {
	case tt == html.ErrorToken:
		return
	case tt == html.StartTagToken:
		t := z.Token()
		isAnchor := t.Data == "value"
		if isAnchor {
			fmt.Println(t.Data)
		}
	}
}
Upvotes: 0
Views: 803
Reputation: 189
This seems to work for me:
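r := strings.NewReader("<value>1,2,3</value><value>,1,3,5</value>")
doc, err := html.Parse(r)
if err != nil {
	log.Fatal(err)
}
var f func(*html.Node)
f = func(n *html.Node) {
	// The text between the tags is a child text node of the <value> element.
	// FirstChild is nil for an empty element such as <value></value>.
	if n.Type == html.ElementNode && n.Data == "value" && n.FirstChild != nil {
		fmt.Println(n.FirstChild.Data)
	}
	// Visit the rest of the tree.
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		f(c)
	}
}
f(doc)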
I think the key is grabbing the FirstChild after finding the "value" node: the text between the tags is stored in a child text node, not in the element node itself.
Upvotes: 4
Reputation: 2671
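You have to call the Text() method on the next token. Text() returns a []byte, so convert it to a string before printing:
if t.Data == "value" {
	// The token right after the start tag holds the text, if there is any.
	if z.Next() == html.TextToken {
		fmt.Println(string(z.Text()))
	}
}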
Upvotes: 1