bli00

Reputation: 2797

Stream window size with regards to HTTP/2 and stream performance

I came across the concept of window size when browsing gRPC's dial options. Because gRPC uses HTTP/2 underneath, I dug this article up, which describes:

Flow control window is just nothing more than an integer value indicating the buffering capacity of the receiver. Each sender maintains a separate flow control window for each stream and for the overall connection.

If this is the window size gRPC is talking about, and if I understand it correctly, it exists so that HTTP/2 can maintain multiple concurrent streams within the same connection. It is basically a number advertised to the sender saying how much data the receiver is willing to accept next. For flow control reasons, the connection puts different streams' data into different windows, serially.

My questions are: is the window all or nothing? Meaning if my window size is n bytes, the stream won't send any data until it's accumulated at least n bytes? More generally, how do I maximize the performance of my stream if I maintain only one stream? I assume a bigger window size would help avoid overhead but increase the risk of data loss?

Upvotes: 4

Views: 3488

Answers (1)

sbordet

Reputation: 18597

Meaning if my window size is n bytes, the stream won't send any data until it's accumulated at least n bytes?

No. The sender can send any number of bytes less than or equal to n.
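To make the "less than or equal to n" point concrete, here is a minimal sketch of sender-side window accounting. The `SendWindow` class and its method names are hypothetical, not taken from gRPC or any HTTP/2 library, and the sketch ignores the connection-level window and frame size limits that a real sender also has to respect.

```python
class SendWindow:
    """Hypothetical sender-side view of one stream's flow control window."""

    def __init__(self, size: int) -> None:
        self.available = size  # bytes the receiver currently allows us to send

    def send(self, pending: int) -> int:
        """Return how many of `pending` bytes may be sent right now."""
        allowed = min(pending, self.available)
        self.available -= allowed
        return allowed

    def window_update(self, increment: int) -> None:
        """Apply a WINDOW_UPDATE increment received from the peer."""
        self.available += increment


window = SendWindow(size=65_535)
first = window.send(100)        # a small write goes out immediately
second = window.send(100_000)   # a large write is capped at what is left
third = window.send(1)          # window exhausted: nothing may be sent
window.window_update(1_000)
fourth = window.send(5_000)     # only the newly granted bytes fit
```

The sender never waits for n bytes to accumulate: it sends whatever it has, up to the remaining window, and stalls only when the window reaches zero.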

More generally, how do I maximize the performance of my stream if I maintain only one stream?

For just one stream, just use the max possible value, 2^31-1.
Furthermore, you want to configure the receiver to send WINDOW_UPDATE frames soon enough, so that the sender always has a large enough flow control window that allows it to never stop sending.

One important thing to note is that the configuration of the max flow control window is related to the memory capacity of the receiver.

Since HTTP/2 is multiplexed, the implementation must continue to read data until the flow control window is exhausted.
Using the max flow control window, 2 GiB, means that the receiver needs to be prepared to buffer at least up to 2 GiB of data, until the application decides to consume that data.

In other words: reading the data from the network by the implementation, and consuming that data by the application may happen at different speeds; if reading is faster than consuming, the implementation must read the data and accumulate it aside until the application can consume it.

When the application consumes the data, it tells the implementation how many bytes were consumed, and the implementation may send a WINDOW_UPDATE frame to the sender, to enlarge the flow control window again, so the sender can continue to send.

Note that implementations really want to apply backpressure, i.e. wait for applications to consume the data before sending WINDOW_UPDATEs back to the sender.
If the implementation (wrongly) acknowledges consumption of data before passing it to the application, then it is open to memory blow-up: the sender will continue to send, and the receiver is forced to buffer the data until its host memory is exhausted (assuming the application consumes data more slowly than the implementation reads it from the network).
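As a rough illustration of backpressure, the sketch below only sends a WINDOW_UPDATE when the application reports consumption. The `BackpressureReceiver` class is a hypothetical model for this answer, not the API of any real implementation.

```python
class BackpressureReceiver:
    """Hypothetical receiver-side accounting for one stream."""

    def __init__(self, window: int) -> None:
        self.window = window      # bytes the sender is still allowed to send
        self.buffered = 0         # bytes read from the network, not yet consumed
        self.sent_updates = []    # WINDOW_UPDATE increments sent so far

    def on_data(self, length: int) -> None:
        """The implementation read `length` bytes from the network."""
        if length > self.window:
            raise ValueError("sender exceeded the flow control window")
        self.window -= length
        self.buffered += length

    def on_consumed(self, length: int) -> None:
        """The application consumed `length` bytes; only now reopen the window."""
        length = min(length, self.buffered)
        self.buffered -= length
        self.window += length
        self.sent_updates.append(length)


receiver = BackpressureReceiver(window=1_000)
receiver.on_data(600)
receiver.on_data(400)
stalled_window = receiver.window   # the sender must now stop
receiver.on_consumed(300)          # the application catches up a little
```

Because the window is reopened only in `on_consumed`, buffered bytes plus the open window always add up to the initial window, so the receiver's memory use stays bounded by the window it advertised.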

Given the above, a single connection, for the max flow control window, may require up to 2 GiB of memory. Imagine 1024 connections (not that many for a server), and you need 2 TiB of memory.
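The arithmetic behind those figures, as a quick check. The 1024 connections are the example number from above, and the worst case assumes every connection has its full window buffered at once.

```python
MAX_WINDOW = 2**31 - 1   # largest HTTP/2 flow control window, in bytes
CONNECTIONS = 1024       # example connection count from the text

per_connection_gib = MAX_WINDOW / 2**30
total_tib = MAX_WINDOW * CONNECTIONS / 2**40

print(f"{per_connection_gib:.2f} GiB per connection")
print(f"{total_tib:.2f} TiB for {CONNECTIONS} connections")
```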

Also consider that for such large flow control windows, you may hit TCP congestion (head of line blocking) before the flow control window is exhausted.
If this happens, you are basically back to the TCP connection capacity, meaning that HTTP/2 flow control limits never trigger because the TCP limits trigger before (or you are otherwise limited by bandwidth, etc.).

Another consideration is that you want to prevent the sender from exhausting the flow control window, because it is then forced to stall and stop sending.

For a flow control window of 1 MiB, you don't want to receive 1 MiB of data, consume it and then send back a WINDOW_UPDATE of 1 MiB, because otherwise the client will send 1 MiB, stall, receive the WINDOW_UPDATE, send another 1 MiB, stall again, etc. (see also how to use Multiplexing http2 feature when uploading).
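One way to avoid that send-stall-send pattern is to send a WINDOW_UPDATE once a fraction of the window has been consumed, rather than waiting for all of it. The sketch below is only an illustration: the `WindowUpdateBatcher` name and the 0.5 ratio are assumptions made for this example, not what any particular implementation does.

```python
class WindowUpdateBatcher:
    """Hypothetical helper deciding when to emit a WINDOW_UPDATE."""

    def __init__(self, window: int, ratio: float = 0.5) -> None:
        self.threshold = int(window * ratio)  # consumed bytes that trigger an update
        self.pending = 0                      # consumed bytes not yet acknowledged

    def on_consumed(self, length: int) -> int:
        """Return the WINDOW_UPDATE increment to send now, or 0 to keep waiting."""
        self.pending += length
        if self.pending < self.threshold:
            return 0
        increment, self.pending = self.pending, 0
        return increment


MIB = 2**20
CHUNK = 256 * 2**10  # the application consumes 256 KiB at a time

eager = WindowUpdateBatcher(window=MIB, ratio=0.5)
eager_updates = [eager.on_consumed(CHUNK) for _ in range(4)]

late = WindowUpdateBatcher(window=MIB, ratio=1.0)
late_updates = [late.on_consumed(CHUNK) for _ in range(4)]
```

With the 0.5 ratio the sender gets half of its window back while the other half is still available to it, so it has room to keep sending. With a ratio of 1.0 the only update arrives after the whole window has been consumed, which is the stalling case described above.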

Historically, small flow control windows (such as the specification's default of 64 KiB, or 65,535 bytes to be exact) caused super-slow downloads in browsers. Browser vendors quickly realized that they needed to tell servers that their flow control window was large enough for the server not to stall the downloads. Currently, Firefox and Chrome set it at 16 MiB.

You want to feed the sender with WINDOW_UPDATEs so it never stalls.

This is a combination of how fast the application consumes the received data, how much you want to "accumulate" the number of consumed bytes before sending the WINDOW_UPDATE (to avoid sending WINDOW_UPDATE too frequently), and how long it takes for the WINDOW_UPDATE to go from receiver to sender.
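A rough way to reason about the round-trip part: in the best case the sender can have at most one window of unacknowledged data in flight per round trip, so throughput is bounded by window / RTT. The sketch below uses that simplified model. It ignores application consume time and WINDOW_UPDATE batching, and the RTT and bandwidth figures are made-up example values.

```python
import math


def max_throughput(window_bytes: int, rtt_ms: int) -> float:
    """Upper bound on bytes per second for a given window and round-trip time."""
    return window_bytes * 1000 / rtt_ms


def min_window(bandwidth_bytes_per_s: int, rtt_ms: int) -> int:
    """Smallest window (bandwidth-delay product) that can keep the link busy."""
    return math.ceil(bandwidth_bytes_per_s * rtt_ms / 1000)


# Assumed 100 ms round trip: the 65,535-byte default vs. a 16 MiB window.
small = max_throughput(65_535, rtt_ms=100)
large = max_throughput(16 * 2**20, rtt_ms=100)

# Assumed 1 Gbit/s link (125,000,000 bytes/s) with a 50 ms round trip.
needed = min_window(125_000_000, rtt_ms=50)

print(f"default window: {small * 8 / 1e6:.2f} Mbit/s at most")
print(f"16 MiB window:  {large * 8 / 1e6:.2f} Mbit/s at most")
print(f"window needed:  {needed / 2**20:.2f} MiB")
```

Under these assumptions the default window caps a single stream at a few Mbit/s no matter how fast the link is, which is consistent with the slow downloads mentioned above.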

Upvotes: 4
