Cade

Reputation: 99

Go Colly not returning any data from website

I am trying to make a simple web scraper in Go, and I can't get even the most basic functionality out of Colly. I took the basic example from the Colly docs, and while it worked with the hackernews.org site they used, it isn't working with the site I am trying to scrape. I tried several variations of the URL (with https://, with www., with a trailing /, etc.) and nothing seems to work. I scraped the same site with Beautiful Soup in Python and got everything, so I know the site can be scraped. Any help is appreciated. Thanks.

package main

import (
    "fmt"

    "github.com/gocolly/colly"
)

// main function
func main() {
    /* instantiate colly */
    c := colly.NewCollector(
        colly.AllowedDomains("www.bjjheroes.com/"),
    )

    // On every a element which has href attribute call callback
    c.OnHTML("a[href]", func(e *colly.HTMLElement) {
        fmt.Printf("Link found: %q \n", e.Text)
    })

    c.Visit("www.bjjheroes.com/a-z-bjj-fighters-list")
}

Upvotes: 4

Views: 2283

Answers (1)

Cade

Reputation: 99

  • The "error" was on my part: the allowed domains needed several more variations. After changing the option to
        colly.AllowedDomains(
            "www.bjjheroes.com/",
            "bjjheroes.com/",
            "https://bjjheroes.com/",
            "www.bjjheroes.com",
            "bjjheroes.com",
            "https://bjjheroes.com",
        ),

everything worked.
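A possible explanation for why the longer list helped, offered as a sketch rather than a confirmed diagnosis: as far as I understand, Colly compares each AllowedDomains entry against the host part of the URL being visited, so only the bare-hostname entries could ever match. The program below uses only the standard library net/url package to show what that host part is. The helpers hostOf and isAllowed are hypothetical names written for this illustration and are not part of Colly.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostOf returns the hostname of raw as parsed by net/url.
// It returns "" when raw has no scheme, because the whole string
// is then treated as a path rather than as host + path.
func hostOf(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return ""
	}
	return u.Hostname()
}

// isAllowed is a hypothetical stand-in for an allow-list check:
// an exact string comparison between host and each entry.
// (Assumption: this mirrors how the scraper's check behaves.)
func isAllowed(host string, allowed []string) bool {
	for _, d := range allowed {
		if d == host {
			return true
		}
	}
	return false
}

func main() {
	// With a scheme, the hostname is just "www.bjjheroes.com".
	host := hostOf("https://www.bjjheroes.com/a-z-bjj-fighters-list")
	fmt.Printf("host: %q\n", host)

	// Try three of the entries from the answer, one at a time.
	entries := []string{"www.bjjheroes.com/", "https://bjjheroes.com", "www.bjjheroes.com"}
	for _, entry := range entries {
		fmt.Printf("%q matches: %v\n", entry, isAllowed(host, []string{entry}))
	}

	// Without a scheme, as in the question's Visit call, no host is parsed.
	fmt.Printf("host without scheme: %q\n", hostOf("www.bjjheroes.com/a-z-bjj-fighters-list"))
}
```

Under these assumptions, only the entry "www.bjjheroes.com" matches; the entries with a trailing slash or a scheme never equal the parsed hostname. If that reading is right, a list of just "www.bjjheroes.com" and "bjjheroes.com" would likely be enough, and the URL passed to Visit would need to start with https:// so that a host can be parsed at all. The answer above does not confirm which of the added entries actually made the difference.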

Upvotes: 3
