Reputation: 20136
I'm looking for a simple way to get the text of a web page, preferably without resorting to a bunch of regular expressions.
I thought I'd check first in case this kind of thing is already built in, or at least easier to do, in Go.
Upvotes: 2
Views: 1306
Reputation: 116
You could use goquery. The library works much like jQuery and lets you pull text and elements out of an HTML document.
This example is taken from the project's GitHub page:
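package main

import (
	"fmt"
	"log"

	"github.com/PuerkitoBio/goquery"
)

func ExampleScrape() {
	// Fetch and parse the page. NewDocument is deprecated in newer goquery
	// releases in favor of http.Get plus goquery.NewDocumentFromReader.
	doc, err := goquery.NewDocument("http://metalsucks.net")
	if err != nil {
		log.Fatal(err)
	}

	// Visit each review block and print the text of its band and title elements.
	doc.Find(".reviews-wrap article .review-rhs").Each(func(i int, s *goquery.Selection) {
		band := s.Find("h3").Text()
		title := s.Find("i").Text()
		fmt.Printf("Review %d: %s - %s\n", i, band, title)
	})
}

func main() {
	ExampleScrape()
}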
Upvotes: 3