I created a function that reads the body of an HTTP GET response in chunks and exposes the chunks as an iterator.
func chunkDownload(ctx context.Context, url string, chunkSizeInBytes uint64) (iter.Seq2[io.Reader, error], error) {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return nil, fmt.Errorf("creating http request: %w", err)
	}
	res, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, fmt.Errorf("making http GET request: %w", err)
	}
	if res.StatusCode != http.StatusOK {
		// Close the body here; the iterator that would close it is never returned.
		res.Body.Close()
		return nil, fmt.Errorf("unsuccessful status code: %s", res.Status)
	}
	return func(yield func(io.Reader, error) bool) {
		buf := make([]byte, chunkSizeInBytes)
		defer res.Body.Close()
		for {
			select {
			case <-ctx.Done():
				yield(nil, ctx.Err())
				return
			default:
				n, err := res.Body.Read(buf)
				if err != nil && err != io.EOF {
					yield(nil, fmt.Errorf("unable to read: %w", err))
					return
				}
				fmt.Printf("Read %d bytes\n", n)
				if n == 0 {
					return
				}
				if !yield(bytes.NewReader(buf[:n]), nil) {
					return
				}
			}
		}
	}, nil
}
What I observed is that each call to res.Body.Read(buf) reads fewer bytes than the buffer size, and the count varies from call to call. Example of the logs:
78 downloaded chunk
Read 24576 bytes
79 downloaded chunk
Read 7168 bytes
80 downloaded chunk
Read 17408 bytes
81 downloaded chunk
Read 26357 bytes
82 downloaded chunk
Read 59392 bytes
83 downloaded chunk
Read 32768 bytes
84 downloaded chunk
Read 27744 bytes
85 downloaded chunk
Why is this the case? What is going on under the hood? Is there any way to get my use case working? (I know about Range requests; I want to implement this function for servers that do not support them.)
Upvotes: 1
Views: 90
"What I observed is that res.body.Read(buf) reads only a small number of bytes. ... Why is this the case?"
The io.Reader documentation explains why:
"Read reads up to len(p) bytes into p. ... If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more."
The call to res.Body.Read(buf) returns what is available instead of waiting for len(buf) bytes of data to arrive from the server.
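Use io.ReadFull to wait for len(buf) bytes of data. Note that when the body ends partway through a chunk, io.ReadFull returns the bytes read so far together with io.ErrUnexpectedEOF; it returns io.EOF only when no bytes were read.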
n, err := io.ReadFull(res.Body, buf)
This is not your question, but the code in the question does not handle the return values from res.Body.Read(buf) correctly. Specifically, it does not handle the case where data and an error are returned together. Here's an attempt to improve the code.
n, err := res.Body.Read(buf)
if n > 0 {
	// Handle any returned data before handling errors.
	if !yield(bytes.NewReader(buf[:n]), nil) {
		return
	}
}
if err == io.EOF {
	// Reached end of stream. Done!
	return
}
if err != nil {
	// Something bad happened. Yield the error and done!
	yield(nil, fmt.Errorf("unable to read: %w", err))
	return
}
A better solution is to call io.Copy(part, res.Body), where part is wherever the application is writing the data.
Upvotes: 1