Austin Walela

Reputation: 325

Go: downloading a file returns an HTML page instead

I would like to download a PGN text file from this URL: http://www.chess.com/echess/download_pgn?lid=1222621131. I have the following (redacted) code, which is supposed to do this, but it downloads an HTML page instead. What could I be doing wrong?

package main

import (
    "fmt"
    "io"
    "log"
    "net/http"
    "os"
)

func main() {
    url := "http://www.chess.com/echess/download_pgn?lid=1222621131"
    filename := "game.pgn"
    resp, err := http.Get(url)
    ...

    file, err := os.Create(filename)
    defer file.Close()

    ...

    size, err := io.Copy(file, resp.Body)   
}

Upvotes: 0

Views: 2124

Answers (1)

Nate

Reputation: 5407

My first guess is that your request is missing the authentication, cookies, and headers that a browser session would normally supply. As an experiment, open Chrome in Incognito mode, open the developer tools, and load the URL you are requesting in that window. Then look at the first GET in the Network tab; the request and response details I got are shown below. Note the status code 302 Found, which means the server is redirecting you, and the Location header, which reads '/login'. Go's http.Get follows redirects automatically, so I suspect the login page is exactly what your code is downloading: unlike your browser, your Go program has no login session or cookies for this site.

Browsers do a lot of work behind the scenes to navigate a website, and reproducing that in code takes some effort. You have to handle cookies, authentication, headers, redirects, and more.

Remote Address: 174.35.7.172:80
Request URL: http://www.chess.com/echess/download_pgn?lid=1222621131
Request Method: GET
Status Code: 302 Found
Response Headers
HTTP/1.1 302 Found
Date: Sat, 25 Jul 2015 20:49:43 GMT
Server: PWS/8.1.20.22
X-Px: ms h0-s1027.p12-sjc ( origin)
P3P: CP="ALL DSP COR LAW CURa ADMa DEVa TAIa OUR BUS IND ONL UNI COM NAV DEM CNT"
Cache-Control: private
Pragma: no-cache
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Content-Length: 0
Content-Type: text/html; charset=utf-8
Location: /login
Connection: keep-alive
Set-Cookie: PHPSESSID=pach18her77q4asgsq2heohvj1; path=/; domain=.chess.com; HttpOnly
Request Headers
GET /echess/download_pgn?lid=1222621131 HTTP/1.1
Host: www.chess.com
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/43.0.2357.134 Safari/537.36
Accept-Encoding: gzip, deflate, sdch
Accept-Language: en-US,en;q=0.8,es;q=0.6
Query String Parameters
lid: 1222621131

Upvotes: 2
