Bryce Thomas

Reputation: 10789

How does TCP slow start increase throughput?

TCP slow start came about in a time when the Internet began experiencing "congestion collapses". The anecdotal example from Van Jacobson and Michael Karels' paper goes:

During this period, the data throughput from LBL to UC Berkeley (sites separated by 400 yards and three IMP hops) dropped from 32 Kbps to 40 bps.

The congestion problem is often described as being caused by the transition from a high-speed link to a slow-speed link, and packet build-up/dropping at the buffer at this bottleneck.
What I'm trying to understand is how such a build-up would cause a drop in end-to-end throughput, as opposed to simply causing superfluous activity/retransmits on the high-speed portion of the link leading into the full buffer. As an example, consider the following network:

    fast       slow       fast
A ======== B -------- C ======== D

A and D are the endpoints and B and C are the packet buffers at a transition from a high-speed to a low-speed network. So e.g. the links between A/B and C/D are 10 Mbps, and the link between B/C is 56 Kbps. Now if A transmits a large (let's say theoretically infinite) message to D, what I'm trying to understand is why it would take it any longer to get through if it just hammered the TCP connection with data versus adapting to the slower link speed in the middle of the connection. I'm envisaging B as just being something whose buffer drains at a fixed rate of 56 Kbps, regardless of how heavily its buffer is being hammered by A, and regardless of how many packets it has to discard because of a full buffer. So if A is always keeping B's buffer full (or overfull as may be the case), and B is always transmitting at its maximum rate of 56 Kbps, how would the throughput get any better by using slow start instead?

The only thing I could think of was if the same packets D had already received were having to be retransmitted over the slow B/C link under congestion, and this was blocking new packets. But wouldn't D have typically ACK'd any packets it had received, so retransmitted packets should be mostly those which legitimately hadn't been received by D because they were dropped at B's buffer?
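
To make that model concrete, here is a rough sketch (Python; the 64KB buffer at B and the tick length are numbers I just made up) of the fixed-drain behaviour I'm imagining. In this naive model the goodput never exceeds 56 Kbps no matter how hard A hammers B's buffer, which is why I'm struggling to see what slow start buys:

    # Naive model: A blasts bytes at B's finite buffer, B drains at exactly
    # 56 Kbps, and anything that does not fit is dropped. Buffer size and
    # tick length are assumptions, not numbers from any real router.
    INGRESS_BPS = 10_000_000     # A -> B link, 10 Mbps
    EGRESS_BPS = 56_000          # B -> C link, 56 Kbps (the bottleneck)
    BUFFER_BYTES = 64 * 1024     # assumed buffer at B
    TICK = 0.01                  # seconds per simulation step

    queued = delivered = dropped = 0.0
    for _ in range(int(10 / TICK)):                # simulate 10 seconds
        offered = INGRESS_BPS / 8 * TICK           # bytes A pushes this tick
        accepted = min(offered, BUFFER_BYTES - queued)
        dropped += offered - accepted
        queued += accepted
        sent = min(queued, EGRESS_BPS / 8 * TICK)  # B drains at a fixed 56 Kbps
        queued -= sent
        delivered += sent

    print(f"goodput ~= {delivered * 8 / 10 / 1000:.0f} Kbps, "
          f"dropped ~= {dropped / 1024 / 1024:.1f} MB")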

Upvotes: 2

Views: 3122

Answers (1)

Mike Pennington

Reputation: 43077

Remember that networks involve sharing resources between multiple computers. Very simplistically, slow start is required to avoid router buffer exhaustion by a small number of TCP sessions (in your diagram, this is most likely at points B and C).

From RFC 2001, Section 1:

Old TCPs would start a connection with the sender injecting multiple segments into the network, up to the window size advertised by the receiver. While this is OK when the two hosts are on the same LAN, if there are routers and slower links between the sender and the receiver, problems can arise. Some intermediate router must queue the packets, and it's possible for that router to run out of space. [2] shows how this naive approach can reduce the throughput of a TCP connection drastically.

...

[2]  V. Jacobson, "Congestion Avoidance and Control," Computer
    Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988.
    ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z.

Routers must have finite buffers. The larger the speed mismatch between links, the greater the chance of buffer exhaustion without slow start. Once the buffer is exhausted, your average TCP throughput goes down, because buffering is what lets TCP keep a link utilized (it absorbs instantaneous bursts instead of dropping packets unnecessarily).
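
To put rough numbers on that (the buffer size below is an assumption, not something from the RFC), here is how quickly a finite buffer at B is exhausted when A sends at its 10 Mbps line rate toward the 56 Kbps link:

    # Back-of-the-envelope: how long an assumed 64KB buffer at B survives when
    # the ingress link is 10 Mbps and the egress (bottleneck) link is 56 Kbps.
    ingress_bps = 10_000_000
    egress_bps = 56_000
    buffer_bytes = 64 * 1024          # assumed; real routers vary widely

    fill_rate = (ingress_bps - egress_bps) / 8     # net bytes queued per second
    print(f"buffer overflows after ~{buffer_bytes / fill_rate * 1000:.0f} ms "
          "of line-rate sending")                  # roughly 50 ms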

Note that RFC 2001 above has been superseded by RFC 5681; however, RFC 2001 offers a more quotable answer to your question.

From your OP...

Now if A transmits a large (let's say theoretically infinite) message to D, what I'm trying to understand is why it would take it any longer to get through if it just hammered the TCP connection with data versus adapting to the slower link speed in the middle of the connection.

First, there is no such thing as an infinite message in TCP. Before slow start came along, TCP was limited only by the window size advertised by the receiver.

So, let's say the initial TCP window was 64KB. If that entire window's worth of data fills the router's tx buffer at B, TCP utilizes less of the link over time due to the dynamics of packet loss, ACKs and TCP back-off. Let's look at the individual situations:

  • B's tx_buffer < 64KB: You automatically lose time to retransmissions, because A's TCP is sending faster than B can dequeue packets.
  • B's tx_buffer >= 64KB: As long as A is the only station transmitting, there are no negative effects (as long as D is ACK-ing correctly); however, if there are multiple hosts on A's LAN trying to transit across the 56K link, there are probably problems, because it takes roughly 200 milliseconds to dequeue a single 1500-byte packet at 56K. With 44 1500-byte packets from A's 64KB initial window (44 * 1460 ≈ 64KB; you only get 1460 bytes of TCP payload per packet), the router's 56K link stays saturated for roughly 9 seconds handling A's traffic alone; the arithmetic is sketched below.
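
A quick way to reproduce those numbers (the 1460-byte MSS and 1500-byte packet size are the usual Ethernet assumptions used above):

    # The arithmetic behind the second bullet: 44 full-size packets from a 64KB
    # window, ~200 ms of serialization per 1500-byte packet at 56 Kbps.
    window_bytes = 64 * 1024
    mss = 1460                          # TCP payload per 1500-byte packet
    bottleneck_bps = 56_000             # the B -> C link

    packets = window_bytes // mss                   # 44 full-size packets
    per_packet_s = 1500 * 8 / bottleneck_bps        # ~0.21 s each
    print(f"{packets} packets x {per_packet_s * 1000:.0f} ms "
          f"= ~{packets * per_packet_s:.0f} s of saturated 56K link")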

The second situation is neither fair nor wise. TCP backs off when it sees any packet loss... multiple hosts sharing a single link must use slow start to keep the situation sane.
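
For intuition on what slow start does instead, here is a toy sketch (Tahoe-style, counting whole segments per RTT, with made-up parameters; nothing close to the full RFC 5681 behaviour): rather than dumping the whole advertised window into B's buffer at once, the sender probes for the bottleneck capacity and backs off when loss appears.

    # Toy Tahoe-style congestion window: exponential growth (slow start) up to
    # ssthresh, then linear growth, and a restart with halved ssthresh on loss.
    # All parameters are made up for illustration.
    def toy_cwnd(loss_rtts, initial_cwnd=1, ssthresh=64, rounds=12):
        cwnd = initial_cwnd
        for rtt in range(rounds):
            print(f"RTT {rtt:2d}: cwnd = {cwnd:3d} segments")
            if rtt in loss_rtts:                # loss detected this round
                ssthresh = max(cwnd // 2, 2)    # multiplicative decrease
                cwnd = initial_cwnd             # back to slow start
            elif cwnd < ssthresh:
                cwnd *= 2                       # slow start: double per RTT
            else:
                cwnd += 1                       # congestion avoidance: +1 per RTT

    toy_cwnd(loss_rtts={6})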

BTW, I have never seen a router with 9 seconds of buffering on an interface. No user would tolerate that kind of latency. Most routers have about 1-2 seconds max, and that was years ago at T-1 speeds. For a number of reasons, buffers today are even smaller.

Upvotes: 2
