[p2p-hackers] Hard question....
matthew at matthew.at
Sun Apr 2 03:38:47 UTC 2006
> While I agree that TCP flow control is good and all, I worry a bit about
> the TCP high-horse and the many newbies who misunderstand it.
I worry more about the newbies who don't understand how much a host's TCP
implementation is doing for them, and go off and naively implement UDP-based
bulk transfer or streaming protocols.
TCP handles RTT calculation (including a good try at not getting it wrong in
the face of extra retransmissions), retransmission timing, a sliding window
(instead of lock-step wait-for-ack), flow control against the receiver's
buffer *and* a good attempt at congestion control when loss is detected.
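
To give a sense of how much is hiding in "RTT calculation" and "retransmission timing" alone, here is a minimal sketch along the lines of RFC 6298, with Karn's rule for ignoring samples from retransmitted segments. The class name, method names, and default limits are assumptions made for illustration, and the clock-granularity term from the RFC is left out. It is not any particular stack's implementation.

```python
class RttEstimator:
    """Smoothed RTT and retransmission timeout, loosely after RFC 6298."""

    ALPHA = 1 / 8  # gain applied to the smoothed RTT
    BETA = 1 / 4   # gain applied to the RTT variance

    def __init__(self, min_rto=1.0, max_rto=60.0):
        # Hypothetical defaults: a 1 s floor and a 60 s cap on the RTO.
        self.srtt = None
        self.rttvar = None
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.rto = 1.0  # initial RTO before any sample has been taken

    def on_sample(self, rtt, retransmitted=False):
        """Feed one RTT measurement, in seconds."""
        # Karn's rule: an ACK for a retransmitted segment is ambiguous
        # (it may answer either copy), so the sample is discarded.
        if retransmitted:
            return
        if self.srtt is None:
            # First measurement seeds both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Variance is updated first, using the old smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        self.rto = min(self.max_rto, max(self.min_rto, self.srtt + 4 * self.rttvar))

    def on_timeout(self):
        """Exponential backoff on a retransmission timeout, capped at max_rto."""
        self.rto = min(self.max_rto, self.rto * 2)
```

The cap applied in `on_timeout` decides how long the sender stays silent after a brief outage: without it, each timeout doubles the wait before the next probe.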
> implicating anyone, it's worth pointing out that TCP is not sacrosanct, it
> does not provide immunity from congestion, and it does not guarantee
> fair bandwidth sharing at the host level.
True enough. TCP is also showing its age... Even with window scaling, large
delay*bandwidth isn't well-tolerated. The AIMD algorithm isn't sufficient
for large delay*bandwidth either, especially if there's slight nonzero loss.
TCP doesn't have selective acknowledgements by default, and the
rarely-implemented SACK specification actually allows the receiver to renege
on acknowledgement claims, which causes transmit buffer issues when lots of
data is in-flight. And most TCP implementations don't properly implement the
specification with regard to capping the max RTO, so brief link outages take
much longer to recover from than they should. And there's *still* the SYN
flood problem, session hijacking potential, and everything else that's been
discovered over the years.
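
A rough way to see the large delay*bandwidth problem is the well-known approximation from Mathis et al. for steady-state AIMD throughput, rate ≈ (MSS/RTT) * sqrt(3/2) / sqrt(p), where p is the loss rate. The sketch below only evaluates that formula next to the path's bandwidth-delay product. The function names and the example figures (1 Gbit/s link, 100 ms RTT, 1460-byte MSS, 0.01% loss) are assumptions chosen for illustration.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate steady-state AIMD throughput in bits per second."""
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5 / loss_rate)


def bdp_bytes(link_bps, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return link_bps * rtt_s / 8


if __name__ == "__main__":
    link_bps = 1e9   # assumed 1 Gbit/s path
    rtt_s = 0.100    # assumed 100 ms round trip
    rate = mathis_throughput_bps(1460, rtt_s, 1e-4)
    print(f"window needed to fill the path: {bdp_bytes(link_bps, rtt_s) / 1e6:.1f} MB")
    print(f"AIMD estimate at 0.01% loss:   {rate / 1e6:.1f} Mbit/s")
```

With these assumed numbers, the path needs about 12.5 MB in flight (far beyond the unscaled 64 KB window), while the formula puts a single flow at roughly 14 Mbit/s, a small fraction of the link.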
> I can create hundreds of TCP (or TCP-like) flows in parallel, easily
> than my fair share of bandwidth, and easily create congestion at the
> closing and creating TCP connections (slow start, anyone?). Many p2p apps
> exactly that: open many connections to many other hosts.
Sure. You could also use TCP to saturate your link simply by issuing
millions of simultaneous new connections. But TCP is the standard for how a
bulk transfer flow should behave in the face of loss. There are traffic
shaping devices that can detect flows that fail to back off *like TCP does*
in the presence of loss, and they will severely penalize such flows. That's
a great reason to use a TCP-like or TCP-friendly algorithm for congestion
Another great reason is to look at the behavior of TCP flow if a parallel
flow takes more or less than what TCP would, given the same average loss and
same RTT... What you'll discover is that TCP operates essentially on a
knife-edge... Take a little more, and you'll drown out the TCP flows. Be a
little more timid, and TCP will take most of the available bandwidth. The
amicima MFP implementation knows about this and uses it to its advantage
when using priority to adapt congestion response, but not knowing at all and
naively ignoring the situation will make users unhappy when they start
trying to do two things at once.
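
The direction of this effect can be seen even in a toy model. The sketch below assumes two or more AIMD flows sharing one bottleneck, with every flow seeing loss in the same round whenever the combined window exceeds capacity. The function name and all parameters are made up for illustration. The model has no queues, timeouts, or RTT differences, so it shows only which way the shares move, not how sharply they move on a real network.

```python
def aimd_shares(increase, backoff, capacity=100.0, rounds=10000):
    """Average bandwidth share per flow under synchronized-loss AIMD.

    increase[i] is flow i's additive increase per round (packets);
    backoff[i] is the factor its window is multiplied by on loss.
    """
    windows = [1.0 for _ in increase]
    totals = [0.0 for _ in increase]
    warmup = rounds // 2  # ignore the first half while the flows converge
    for r in range(rounds):
        # Additive increase: every flow grows its window each round.
        windows = [w + a for w, a in zip(windows, increase)]
        # Multiplicative decrease: all flows back off on overload.
        if sum(windows) > capacity:
            windows = [w * b for w, b in zip(windows, backoff)]
        if r >= warmup:
            for i, w in enumerate(windows):
                totals[i] += w
    grand = sum(totals)
    return [t / grand for t in totals]


if __name__ == "__main__":
    print("two TCP-like flows:        ", aimd_shares([1.0, 1.0], [0.5, 0.5]))
    print("second flow grows 2x faster:", aimd_shares([1.0, 2.0], [0.5, 0.5]))
    print("second flow backs off less: ", aimd_shares([1.0, 1.0], [0.5, 0.75]))
```

In this model, identical flows split the link evenly, and a flow that ramps faster or backs off less ends up with the larger share at the expense of the TCP-like one.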
> In fact, I'm cranky at the moment because some idiot's p2p download is
> all the bandwidth at my current wireless hotspot. Maybe what we need is to
> extend the TCP ideas from the flow level to the host-level (and either embed
> them deep into the OS or enforce them via traffic shaping).
amicima's MFP does share congestion state between all flows that travel
between a given pair of hosts, which results in much better behavior in the
case where you have multiple parallel file transfers... In addition, there's
flow prioritization, which allows a higher priority flow (e.g., a VOIP flow)
to get first dibs on the available bandwidth, rather than simply taking its
chances. And MFP also shares received priority data with all the other hosts
it is talking to, so that if A is sending high priority data to C, and B is
sending low priority data to C, B knows to be more aggressive in backing off
if it detects loss so as to leave inbound room at C for the flow from A.
Obviously if you throw TCP flows into the mix, you don't get all the
benefits, but you do still get *tested* TCP-friendly performance (some of
the TCP-friendly rate control algorithms are actually quite poor in real
life, due to excess time constant in their feedback loop or other subtle
Getting congestion control to work properly in MFP took the majority of our
development time. We tried several alternative approaches... TFRC-like
algorithms, explicit loss reporting vs. deriving loss from acknowledgements,
token-bucket rate shaping vs. data-in-flight control. The theory says that a
whole lot of things will work. In practice, only a few operate
correctly in real life, and there are a lot of tricks (e.g., how we calculate
RTT) that improve performance more than you might expect at first glance.
Knowing now how many programmer hours it took to get it to even a passable
state, I wouldn't recommend the exercise to anyone.
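
For readers unfamiliar with the last pair of alternatives, the difference between token-bucket rate shaping and data-in-flight control can be sketched as two small gatekeepers. Both classes and their parameters are hypothetical and say nothing about how MFP does it. The token bucket is driven by the clock, while the in-flight window is driven by acknowledgements, so it naturally slows down when ACKs stop arriving.

```python
class TokenBucket:
    """Rate shaping: sending is paced by elapsed time, not by ACKs."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes     # bucket depth in bytes
        self.tokens = float(burst_bytes)
        self.last = 0.0              # time of the previous refill, in seconds

    def try_send(self, size, now):
        # Refill for the time elapsed since the last call, up to the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


class InFlightWindow:
    """Data-in-flight control: sending is clocked by returning ACKs."""

    def __init__(self, cwnd_bytes):
        self.cwnd = cwnd_bytes
        self.in_flight = 0

    def try_send(self, size):
        # Only send if the unacknowledged data stays within the window.
        if self.in_flight + size <= self.cwnd:
            self.in_flight += size
            return True
        return False

    def on_ack(self, size):
        # An ACK frees room in the window for new data.
        self.in_flight = max(0, self.in_flight - size)
```

A congestion controller would adjust `rate_bps` in the first design and `cwnd_bytes` in the second; the sketch leaves that adjustment out.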
> That said, it's better to use a protocol with built-in congestion control
Absolutely. In fact, for bulk transfer or streaming media, developers should
consider congestion control *mandatory* for proper behavior on the Internet,
and yes, that *should* include RTP VOIP flows too.
> and it's better to adopt TCP's flow control than either nothing or
> something untested at large.
TCP's flow control (and of course there's several flavors... Reno, Vegas,
etc.) is both a good start, and what almost all the other traffic is
using... So you either need to emulate it, or come up with something that
interoperates fairly when the majority of the other parallel flows *are*
And if you don't know how to do that correctly, or don't have the time to
implement *and test* it, you should just use TCP or some other protocol
stack that has solved the problem already.
matthew at matthew.at