Reputation: 1982
I'm scraping HTML pages and have set up a HTTP client like so:
client := &http.Client{
    Transport: &http.Transport{
        Dial: (&net.Dialer{
            Timeout:   30 * time.Second,
            KeepAlive: 30 * time.Second,
        }).Dial,
        TLSHandshakeTimeout:   10 * time.Second,
        ResponseHeaderTimeout: 10 * time.Second,
    },
}
Now when I make GET requests to multiple URLs, I don't want to get stuck on URLs that deliver a massive amount of data.
response, err := client.Get(page.Url)
checkErr(err)
body, err := ioutil.ReadAll(response.Body)
checkErr(err)
page.Body = string(body)
Is there a way to limit the amount of data (bytes) a GET request accepts from a resource, and to stop reading once that limit is reached?
Upvotes: 8
Views: 9891
Reputation: 109405
Use an io.LimitedReader
A LimitedReader reads from R but limits the amount of data returned to just N bytes.
limitedReader := &io.LimitedReader{R: response.Body, N: limit}
body, err := io.ReadAll(limitedReader)
or
body, err := io.ReadAll(io.LimitReader(response.Body, limit))
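For context, a rough sketch of how this could slot into the code from the question (assuming page, client, and checkErr from the original snippets, limit as an int64, and the log package for the truncation notice):
response, err := client.Get(page.Url)
checkErr(err)
defer response.Body.Close()

// Read at most limit bytes; anything beyond that is left unread.
limitedReader := &io.LimitedReader{R: response.Body, N: limit}
body, err := io.ReadAll(limitedReader)
checkErr(err)

// If N dropped to 0, the body may have been cut off at the limit.
if limitedReader.N == 0 {
    log.Printf("%s: body may be truncated at %d bytes", page.Url, limit)
}
page.Body = string(body)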
Upvotes: 29
Reputation: 1
You can use io.CopyN:
package main

import (
    "io"
    "net/http"
    "os"
)

func main() {
    r, e := http.Get("http://speedtest.lax.hivelocity.net")
    if e != nil {
        panic(e)
    }
    defer r.Body.Close()

    // Copy at most 100 bytes of the body to stdout; the rest is never read.
    io.CopyN(os.Stdout, r.Body, 100)
}
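Note that io.CopyN returns io.EOF if the body is shorter than 100 bytes; for this purpose that error can usually be ignored.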
Or a Range header:
package main

import (
    "net/http"
    "os"
)

func main() {
    req, e := http.NewRequest("GET", "http://speedtest.lax.hivelocity.net", nil)
    if e != nil {
        panic(e)
    }
    // Ask the server for only the first 100 bytes.
    req.Header.Set("Range", "bytes=0-99")

    res, e := new(http.Client).Do(req)
    if e != nil {
        panic(e)
    }
    defer res.Body.Close()
    os.Stdout.ReadFrom(res.Body)
}
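The Range header is only advisory, though: a server may ignore it and reply with the full body (status 200 instead of 206 Partial Content), so it can be worth enforcing the cap client-side as well. A rough sketch combining both approaches, reusing req from above (and importing io for the fallback):
res, e := new(http.Client).Do(req)
if e != nil {
    panic(e)
}
defer res.Body.Close()

// 206 means the server honored the Range header; otherwise cap the read
// ourselves so a full 200 response can't deliver more than 100 bytes.
if res.StatusCode == http.StatusPartialContent {
    os.Stdout.ReadFrom(res.Body)
} else {
    io.CopyN(os.Stdout, res.Body, 100)
}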
Upvotes: 1