Reputation: 328
I want to send a fairly large number (several thousand) of HTTP requests as quickly as possible without putting too much load on the CDN (it has an https: URL, and ALPN selects HTTP/2 during the TLS handshake). Staggering (i.e. time-shifting) the requests is an option, but I don't want to wait too long: the goal is to minimize both errors and total round-trip time. I'm not yet being rate limited by the server at the scale I'm operating.
The problem I'm seeing originates in h2_bundle.go, specifically in either writeFrame or onWriteTimeout, when about 500-1k requests are in flight. It manifests during io.Copy(fileWriter, response.Body) as:
http2ErrCodeInternal = "INTERNAL_ERROR" // also IDs a Stream number
// ^ then io.Copy observes the reader encountering "unexpected EOF"
I'm fine sticking with HTTP/1.x for now, but I would love an explanation re: what's going on. Clearly, people DO use Go to make a lot of round-trips happen per unit time, but most advice I can find is from the perspective of the server, not clients. I've already tried specifying all the relevant time-outs I can find, and cranking up connection pool max sizes.
Upvotes: 2
Views: 2440
Reputation: 328
Here's my best guess at what's going on:
The rate of requests is overwhelming a queue of connections or some other resource in the HTTP/2 internals. This may be fixable in general, or tunable for my specific use case, but the fastest way around this kind of problem is to rely on HTTP/1.1 entirely and then add a limited form of retry and rate limiting.
As an aside, I am now using a single retry and a rate.Limiter (from https://pkg.go.dev/golang.org/x/time/rate#Limiter) in addition to the "ugly hack" of disabling HTTP/2, so that outbound requests are able to send an initial "burst" of M requests and then "leak more gradually" at a given rate of N/sec. Ultimately, the errors from h2_bundle.go are just too ugly for end users to parse. In my humble opinion, any expected/unexpected EOF should result in the client "giving it another try" or two, which is more pragmatic anyway.
As per the docs, the easiest way to disable h2 in Go's http.Client at runtime is env GODEBUG=http2client=0 ..., which I can also achieve in other ways, shown below. It's especially important to understand that the "next protocol" is negotiated early, during the TLS handshake, so Go's http.Transport must manage that configuration along with a singleton cache/memo to provide that functionality in a performant way. Therefore, use your own httpClient to .Do(req) (and don't forget to give your Request a context.Context so that it's trivial to cancel it), with a custom http.RoundTripper as its Transport. Here's some example code:
import (
	"crypto/tls"
	"log"
	"net/http"
	"time"
)
type forwardRoundTripper struct {
	rt http.RoundTripper // e.g. an *http.Transport
}

func (my *forwardRoundTripper) RoundTrip(r *http.Request) (*http.Response, error) {
	// set r = r.WithContext(ctx) for some `ctx` if you desire
	// adjust URL, or this transport (as necessary, per-request)
	// NOTE: A very common thing is to add HTTP headers here
	return my.rt.RoundTrip(r)
}

// httpClient has an http.RoundTripper given for general Transport!
// (don't forget to choose a reasonable CheckRedirect and Jar/etc.)
var httpClient = &http.Client{
	Timeout:   time.Second * 10, // or whatever you prefer here
	Transport: &forwardRoundTripper{rt: http.DefaultTransport},
}
// longTimeout is an assumed value; the original snippet did not define it.
const longTimeout = 30 * time.Second

func h2Disabled(rt *http.Transport) *http.Transport {
	log.Println("--- only using HTTP/1.x ...")
	rt.ForceAttemptHTTP2 = false // not good enough
	// at least one of the following is ALSO required:
	if rt.TLSClientConfig == nil {
		// http.DefaultTransport has a nil TLSClientConfig, so create one first
		// (if you do this replacement, don't forget to set a minimum TLS version)
		rt.TLSClientConfig = &tls.Config{MinVersion: tls.VersionTLS12}
	}
	rt.TLSClientConfig.NextProtos = []string{"http/1.1"}
	// need to Clone() or replace the TLSClientConfig if a request already occurred
	// - Why? Because the first time the transport is used, it caches certain structures.
	rt.TLSHandshakeTimeout = longTimeout // not related to h2, but necessary for stability
	rt.TLSNextProto = make(map[string]func(authority string, c *tls.Conn) http.RoundTripper)
	// ^ some sources seem to think this is necessary, but not in all cases;
	// the net/http docs describe a non-nil, empty map as the way to disable HTTP/2
	// (it WILL be required if an "h2" key is already present in this map)
	return rt
}
func init() {
	h := httpClient
	h2ok := ... // e.g. cmp.Or(os.Getenv("IS_H2_OK"), "yes") == "yes"
	if t, ok := h.Transport.(*forwardRoundTripper); ok && !h2ok {
		// check the assertion: calling Clone() on a nil *http.Transport would panic
		if h2t, ok := t.rt.(*http.Transport); ok {
			// swap the inner transport so the forwarding wrapper stays in place
			t.rt = h2Disabled(h2t.Clone())
			// recommended: log at warn (h2 disabled)
		}
	}
	// log at info about Client
	// tweak rate limits here
}
This lets us make the volume of requests we need, or at least get more reasonable errors in the edge cases that we previously suffered from.
Upvotes: 1