Reputation: 20146
We have a script that checks every web link in our database records once a day (the users want to be notified when a link goes out of date).
A couple of sites work fine in a web browser from this IP address, but when fetched from Go they either disconnect before the request completes or return an HTTP authorisation denied message.
I assume some sort of firewall (F5) is filtering or blocking the request. This happens even when I change the HTTP request to use a common user agent. What can we do to make a Go request look like it comes from a standard browser?
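import (
	"net/http"
	"time"
)

// fetch_url issues a GET for url and returns the HTTP status code.
func fetch_url(url string, d time.Duration) (int, error) {
	// The timeout covers the whole exchange, including redirects.
	client := &http.Client{
		Timeout: d,
	}
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return 0, err
	}
	// Pretend to be Safari on an iPad.
	req.Header.Set("User-Agent", "Mozilla/5.0 (iPad; CPU OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53")
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	// Only the status is needed; close the body so the connection is released.
	defer resp.Body.Close()
	return resp.StatusCode, nil
}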
Upvotes: 1
Views: 406
Reputation: 156572
Try matching the exact headers that your web browser sends, to eliminate other factors. A smart firewall could use heuristics to tell a web browser from a robot.
Notice that the Go HTTP client sends only a minimal HTTP request:
GET /foo HTTP/1.1
Host: localhost:3030
User-Agent: Go 1.1 package http
Accept-Encoding: gzip
A web browser, by contrast, is chattier:
GET /foo HTTP/1.1
Host: localhost:3030
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8
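As a rough sketch of that suggestion, the browser headers listed above can be copied onto the Go request before it is sent. The helper name newBrowserLikeRequest and the browserHeaders map are made up for this illustration, and the header values are simply the ones from the example request; substitute whatever your own browser actually sends. This is not guaranteed to get past any particular firewall, since it may look at other signals as well.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httputil"
)

// browserHeaders copies the values from the browser request shown above.
// They are illustrative; replace them with the headers your browser sends.
var browserHeaders = map[string]string{
	"Accept":          "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
	"User-Agent":      "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36",
	"Accept-Language": "en-US,en;q=0.8",
	// Setting Accept-Encoding by hand turns off net/http's transparent gzip
	// decoding, which is fine when only the status code is inspected.
	"Accept-Encoding": "gzip, deflate, sdch",
}

// newBrowserLikeRequest (hypothetical helper) builds a GET request that
// carries the browser-style headers above.
func newBrowserLikeRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	for name, value := range browserHeaders {
		req.Header.Set(name, value)
	}
	return req, nil
}

func main() {
	req, err := newBrowserLikeRequest("http://localhost:3030/foo")
	if err != nil {
		fmt.Println("building request:", err)
		return
	}
	// DumpRequestOut renders the request as it would go on the wire,
	// without contacting the server, so it can be compared with the browser's.
	dump, err := httputil.DumpRequestOut(req, false)
	if err != nil {
		fmt.Println("dumping request:", err)
		return
	}
	fmt.Printf("%s", dump)
}
```

To use it in the question's function, build the request with newBrowserLikeRequest and pass it to client.Do as before. Note that net/http writes headers in its own order rather than the browser's, so a firewall that fingerprints header order could still tell the two apart.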
Upvotes: 3