[p2p-hackers] MTU in the real world
Matthew Kaufman
matthew at matthew.at
Tue May 31 13:21:27 UTC 2005
> I've read in multiple places that it's best to have a UDP MTU
> of under 1500 bytes. However, it sounds like most of this is
> based on theoretical analysis, and not on real-world experience.
It is best to not send traffic that needs to be fragmented, because:
1) Fragment creation is expensive at the point it occurs
2) Reconstruction of fragmented packets is expensive at the receiver
3) (most important) loss of any fragment causes loss of the entire packet,
which magnifies the effect of packet loss (imagine that you lose 1 in every
10 packets on your path... That's 10% loss if you're sending one-piece
packets, but 100% packet loss if you're sending things that get broken up
into enough pieces that 1 in 10 fragments lost causes every packet to be
missing at least one fragment)
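To put rough numbers on point 3, here is a minimal sketch of the arithmetic. It assumes each fragment is lost independently with the same probability, which real paths only approximate; the function name is made up for illustration.

```python
def packet_loss_rate(fragment_loss_rate: float, fragments: int) -> float:
    """Chance that a packet split into `fragments` pieces is lost.

    Assumption: every fragment is lost independently with probability
    `fragment_loss_rate`, and losing any one fragment loses the packet.
    """
    if fragments < 1:
        raise ValueError("fragments must be at least 1")
    if not 0.0 <= fragment_loss_rate <= 1.0:
        raise ValueError("fragment_loss_rate must be between 0 and 1")
    # The packet survives only if every one of its fragments survives.
    return 1.0 - (1.0 - fragment_loss_rate) ** fragments
```

With 10% loss per fragment, an unfragmented packet sees 10% loss, a packet in two fragments sees 19%, and the figure keeps climbing toward 100% as the fragment count grows.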
Given this, there are a couple of options...
> With this in mind, have you tried using a MTU bigger than
> 1500 bytes and been bitten by it?
It is hard to tell when you've been "bitten by it", because fragmentation
occurs, packets make it to the far end, and unless you're really paying
attention you don't see the CPU costs or the increased apparent loss rates.
I'm *sure* there are applications written by people who didn't consider
fragmentation which are out there, "work ok", and yet could be a lot better.
> Basically, do you know of
> any empirical analysis (of any level of formality) of a
> real-world UDP application that supports or refutes the 1500
> byte rule of thumb?
The right number of bytes to use for maximum transfer efficiency is the
smallest link MTU on any hop you're travelling through. Bigger, and you
fragment (bad). Smaller, and you're paying more in headers than the optimal
case (not that bad, but something to consider).
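The header cost of going smaller can be estimated with a quick calculation. This is only a sketch: it assumes IPv4 with no options (20 bytes) plus UDP (8 bytes) and ignores link-layer framing, and the function name is invented for the example.

```python
IPV4_HEADER = 20  # bytes, assuming an IPv4 header with no options
UDP_HEADER = 8    # bytes

def payload_efficiency(payload_bytes: int) -> float:
    """Fraction of each IPv4/UDP packet that is application payload."""
    if payload_bytes <= 0:
        raise ValueError("payload_bytes must be positive")
    return payload_bytes / (payload_bytes + IPV4_HEADER + UDP_HEADER)
```

A 1472-byte payload (a full 1500-byte packet) is about 98% payload, while a 100-byte payload is about 78%, so small packets pay noticeably more in headers.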
There is one case (avoiding head-of-line blocking) where you actually do
want to use significantly less than the MTU... Matters for multimedia
applications on slower or more congested links.
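The head-of-line effect comes down to how long one packet occupies the link. A small sketch of that calculation follows; the 128 kbit/s link speed used in the note after it is just an example figure, not something measured.

```python
def serialization_delay_ms(packet_bytes: int, link_bits_per_second: float) -> float:
    """Time in milliseconds that a packet occupies a link while being sent."""
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    return packet_bytes * 8 / link_bits_per_second * 1000.0
```

On a 128 kbit/s uplink, a 1500-byte packet holds the link for 93.75 ms, so a voice packet queued behind it waits that long. Cutting the packet size to 300 bytes brings that down to under 19 ms.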
Now, how do you determine the smallest link MTU on any hop you're travelling
through?
There are two good ways... Path MTU Discovery, which sends packets of
increasing size with the "DF" (Don't Fragment) bit set, until they stop
getting through or start causing ICMP Destination Unreachable (Fragmentation
Required) messages, then backs down (usually with a "logical" size
progression, making good guesses about "likely" MTUs... Size of an FDDI
frame, size of an Ethernet frame, etc). Breaks horribly if you encounter a
firewall that blocks the Fragmentation Required ICMP messages. (Typical
cause of "my browser + operating system doesn't work with this bank web site
when they send my transaction history but I could log in ok" kinds of bugs,
or "I can't reach this bank when I'm using my VPN tunnel" kinds of bugs, if
you're an ISP tech support person)
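If you end up doing this probing yourself at the application layer, the search described above might look something like the sketch below. Everything here is an assumption for illustration: `probe` is a hypothetical callback that sends one DF-marked packet of the given total size and reports whether it got through, and the candidate list is a set of plausible guesses rather than an authoritative table.

```python
from typing import Callable, Iterable, Optional

# Illustrative "likely" packet sizes in bytes (assumed for this sketch):
# a few conservative values, PPPoE (1492), Ethernet (1500), FDDI (4352).
CANDIDATE_MTUS = (576, 1006, 1280, 1400, 1492, 1500, 4352)

def discover_path_mtu(
    probe: Callable[[int], bool],
    candidates: Iterable[int] = CANDIDATE_MTUS,
) -> Optional[int]:
    """Return the largest candidate size that got through, or None.

    `probe(size)` is a hypothetical callback: it should send one packet of
    `size` bytes with DF set and return True only if the far end confirmed it.
    """
    best = None
    for size in sorted(candidates):
        if not probe(size):
            break  # stopped getting through: back down to the last good size
        best = size
    return best
```

Relying on the far end's confirmation instead of the ICMP message is what keeps this working behind firewalls that drop ICMP. A real implementation would also retry each size a few times, since a probe lost to ordinary packet loss looks exactly like one that was too big.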
Or "a good guess"... Almost everyone on broadband connections is connected
to their broadband router with Ethernet or Wireless Ethernet. Even if they
aren't, the router at the far end is connected directly or indirectly to one
of the upstream routers with Ethernet, and a whole lot of peering happens
over Ethernet. Almost nobody is using smaller-than-Ethernet MTUs in their
core networks (because they don't want things with broken Path MTU Discovery
(e.g., because there are firewalls blocking ICMP at the ends) to break because
of *them*), but all the hosts are connected with Ethernet, so that's a good
argument for 1500. However, I would strongly suggest that you consider the
common tunnel cases, and use 1492 or smaller: 1492 is Ethernet payload
(1500) minus PPPoE header (8). GRE can add 28 or 36 bytes of overhead, IPIP
adds 20, IPsec tunnels add 28, L2TP adds 12. Imagine cases like L2TP tunnel
over PPPoE, and adjust accordingly.
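The "adjust accordingly" step is plain subtraction. Here is a sketch that uses the overhead figures quoted above (taking the smaller of the two GRE numbers); the table and function names are made up for the example, and real overheads vary with configuration.

```python
ETHERNET_MTU = 1500  # bytes of Ethernet payload

# Per-layer overhead in bytes, as quoted in the message above.
# GRE is listed as 28 or 36; the smaller figure is used here.
TUNNEL_OVERHEAD = {
    "pppoe": 8,
    "gre": 28,
    "ipip": 20,
    "ipsec": 28,
    "l2tp": 12,
}

def effective_mtu(layers, base_mtu: int = ETHERNET_MTU) -> int:
    """Largest packet that fits once every listed tunnel layer is stacked."""
    return base_mtu - sum(TUNNEL_OVERHEAD[name] for name in layers)
```

For instance, `effective_mtu(["pppoe"])` gives 1492, and `effective_mtu(["pppoe", "l2tp"])` gives 1480 for the L2TP-over-PPPoE case.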
And remember to account for the IP (typically 20 bytes) and UDP (8 bytes)
headers when you're working out your maximum *payload* size: the payload
plus those headers has to come out at the maximum packet size you're aiming for.
Also remember to round down to your encryption block size if, like us,
you're encrypting the entire payload.
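Putting the header and block-size adjustments together, a sketch of the payload calculation could look like this. The function name is invented, and the 16-byte block size in the note after it is only an assumed example (no particular cipher is implied by the text above).

```python
def max_udp_payload(
    packet_size: int,
    ip_header: int = 20,   # typical IPv4 header without options
    udp_header: int = 8,
    block_size: int = 1,   # cipher block size in bytes; 1 means no rounding
) -> int:
    """Largest payload that keeps the whole packet within `packet_size`."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    payload = packet_size - ip_header - udp_header
    if payload <= 0:
        raise ValueError("packet_size is too small to carry any payload")
    # Round down to a whole number of cipher blocks.
    return payload - (payload % block_size)
```

With a 1492-byte target, the plain payload limit is 1464 bytes, and with a 16-byte block size it rounds down to 1456.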
>
> Furthermore, I've read that if you "connect" your UDP socket
> to the remote side and then start sending large packets and
> backing off slowly, the socket layer will compute the "real"
> MTU between two endpoints, and you can obtain it through
> "getsockopt". Do you know of anyone who's tried this, and
> the results?
I'm not aware of any operating system that lets you access Path MTU
Discovery like this, but I suppose it might exist. Certainly there are some
which do it for TCP sessions.
Matthew Kaufman
matthew at matthew.at
www.amicima.com