[p2p-hackers] UDP file transfer link speed identification
David Barrett
dbarrett at quinthar.com
Thu Jul 21 23:19:30 UTC 2005
Wes Felter wrote:
> Have you read about DCCP, TFRC, and TCP-like Congestion Control?
These are great references, thank you very much. Not only are they good
in isolation, but they helped fill in the cracks in my understanding of
TCP. I'll record what I learned here for others who are similarly
confused. As I do so, can you clarify I'm understanding it correctly?
In essence, TCP maintains a "congestion window" (cwnd) that dictates how
many unacknowledged packets can be "in flight". So for a window size of
1, the server sends one packet, and waits for an explicit ACK before
sending the next. Assuming a 100ms round-trip latency, this means at
most ten packets will be sent a second. As the server increases the
in-flight window size (or as the latency decreases), the server sends at
a faster pace.
The server uses an algorithm named "Additive Increase, Multiplicative
Decrease (AIMD)" to regulate cwnd. Basically, each time an ACK is
received, cwnd is incremented by some amount. However each time cwnd is
to be decreased, it backs off by some set fraction (more on this in a
bit). This means it increases cautiously, and backs off aggressively.
Because the server increases cwnd in response to each ACK, and because
each packet (typically) generates its own ACK, this results in initial
exponential increase in send rate -- even though cwnd is only
incrementing linearly. It took me a bit to wrap my head around why, but
this quote from RFC 2001 cleared me up:
> The sender starts by transmitting one segment and waiting for its
> ACK. When that ACK is received, the congestion window is incremented
> from one to two, and two segments can be sent. When each of those
> two segments is acknowledged, the congestion window is increased to
> four. This provides an exponential growth, although it is not
> exactly exponential because the receiver may delay its ACKs,
> typically sending one ACK for every two segments that it receives.
Somewhat counterintuitively, this exponential rate increase mode of TCP
is called "slow start" -- named so because it starts transmitting very
slowly, though increases quite quickly.
Regardless, at some point it'll start sending at a faster rate than can
be delivered. Routers will discard packets they can't forward, and thus
the sender discovers it's sending too fast when it fails to receive an
ACK for a given packet. In other words, packet loss is what notifies
the sender that it needs to slow down.
So once the sender detects a packet has been dropped, it backs off its
cwnd by some fraction (TCP suggests that it back off to 50%, but [1]
recommends only to 85%) before starting to increase again.
Now, after the slow-start approach hit the ceiling and backed off some
fraction, it could just restart with exponential growth again. This
would again hit the ceiling and back off indefinitely, creating the
up-and-down graph shown in [1], reminiscent of Sisyphus and his boulder.
[1] http://www-lce.eng.cam.ac.uk/~ctk21/scalable/
However, it would be preferable to "level off" at the top send rate and
just stick with that. For this reason after the initial slow-start
approach hits the ceiling, it uses a different algorithm called
"congestion avoidance". Thus rather than quickly shooting up to the
ceiling with exponential growth, it just creeps up using linear growth
until it experiences packet loss again. At that point it just keeps
transmitting at that rate, as graphed in [2].
[2] http://www.csm.ornl.gov/~dunigan/netperf/kelly.html
Now, all this is really simplified, and I recommend the above two
sources, as well as RFC 2581 for more details. Furthermore, there are
variations on how to tweak these algorithms for better performance ([1]
describes one in particular). But this should get you started.
Now, with all that out of the way, here's what I don't understand:
(a) After "congestion avoidance" has been started (ie, cwnd has exceeded
ssthresh), how does it determine when to stop incrementing the cwnd?
I see that [2] clearly shows cwnd (and hence, the send rate) level off
at some ceiling amount. And RFC 2581 clearly states "Congestion
avoidance continues until congestion is detected." (ie, a packet loss
occured). But this is easier said than done.
Assuming TCP is a very simplified finite state machine (FSM) that starts
in "slow start" state and transitions to "congestion avoidance" state
when a packet loss is detected, does it likewise transition from
"congestion avoidance" to "hold steady" once a second packet loss is
detected? If so, under what circumstances does it transition from "hold
steady" back to "collision avoidance" or "slow start"? Furthermore,
does the cwnd get adjusted at all in "hold steady"?
Thank again Wes for the references, and I hope my summary above is
reasonably accurate and can help others better understand the topic.
-david
More information about the P2p-hackers
mailing list