[p2p-hackers] MTU in the real world

David Barrett dbarrett at quinthar.com
Tue May 31 21:38:14 UTC 2005


Matthew -- Thanks for the extensive detail on the origin of the 1500 
number (or actually, 1492).  That helps me understand and have 
confidence in it.

But even given that, have you ever tried to go above it (perhaps in your 
work with Amicima or before)?  What kind of symptoms did you experience?

-david

On Tue, 31 May 2005 6:58 am, Matthew Kaufman wrote:
>>  I've read in multiple places that it's best to have a UDP MTU
>>  of under 1500 bytes.  However, it sounds like most of this is
>>  based on theoretical analysis, and not on real-world experience.
>
> It is best to not send traffic that needs to be fragmented, because:
>
> 1) Fragment creation is expensive at the point it occurs
> 2) Reconstruction of fragmented packets is expensive at the receiver
> 3) (most important) Loss of any fragment causes loss of the entire
> packet, which magnifies the effect of packet loss. Imagine that you
> lose 1 in every 10 packets on your path: that's 10% loss if you're
> sending one-piece packets, but 100% packet loss if you're sending
> things that get broken up into enough pieces that 1 in 10 fragments
> lost causes every packet to be missing at least one fragment.
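
To put rough numbers on point 3, here is a minimal sketch. It assumes each fragment is lost independently with the same probability, which is a simplification; the 100% figure in the quoted text corresponds to the worst case where the losses are evenly spaced. The function name is illustrative only.

```python
def packet_loss_rate(fragment_loss: float, fragments: int) -> float:
    """Chance that a packet split into `fragments` pieces loses at least one.

    Assumes every fragment is lost independently with probability
    `fragment_loss` (a simplifying assumption, not a measured model).
    """
    return 1.0 - (1.0 - fragment_loss) ** fragments


if __name__ == "__main__":
    # 10% per-fragment loss, with the packet split into 1, 2, 4 and 10 pieces.
    for n in (1, 2, 4, 10):
        print(n, round(packet_loss_rate(0.10, n), 3))
```

Under that assumption, 10% fragment loss becomes roughly 19% packet loss with two fragments and roughly 65% with ten.
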
>
> Given this, there's a couple of options...
>
>>  With this in mind, have you tried using a MTU bigger than
>>  1500 bytes and been bitten by it?
>
> It is hard to tell when you've been "bitten by it", because
> fragmentation occurs, packets make it to the far end, and unless
> you're really paying attention you don't see the CPU costs or the
> increased apparent loss rates. I'm *sure* there are applications
> written by people who didn't consider fragmentation which are out
> there, "work ok", and yet could be a lot better.
>
>>  Basically, do you know of
>>  any empirical analysis (of any level of formality) of a
>>  real-world UDP application that supports or refutes the 1500
>>  byte rule of thumb?
>
> The right number of bytes to use for maximum transfer efficiency is
> the smallest link MTU on any hop you're travelling through. Bigger,
> and you fragment (bad). Smaller, and you're paying more in headers
> than the optimal case (not that bad, but something to consider).
>
> There is one case (avoiding head-of-line blocking) where you actually
> do want to use significantly less than the MTU... Matters for
> multimedia applications on slower or more congested links.
>
> Now, how do you determine the smallest link MTU on any hop you're
> travelling through?
>
> There are two good ways. The first is Path MTU Discovery, which sends
> packets of increasing size with the "DF" (Don't Fragment) bit set
> until they stop getting through or start causing ICMP Destination
> Unreachable (Fragmentation Required) messages, then backs down
> (usually with a "logical" size progression, making good guesses about
> "likely" MTUs: size of an FDDI frame, size of an Ethernet frame,
> etc). It breaks horribly if you encounter a firewall that blocks the
> Fragmentation Required ICMP messages. (This is the typical cause of
> "my browser + operating system doesn't work with this bank web site
> when they send my transaction history but I could log in ok" kinds
> of bugs, or "I can't reach this bank when I'm using my VPN tunnel"
> kinds of bugs, if you're an ISP tech support person.)
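
The "logical size progression" can be sketched as a walk through a table of likely MTUs. This is only an illustration: `probe` is a hypothetical callback standing in for "send a DF packet of this size and report whether it got through", and the sizes are the plateau values listed in RFC 1191. A real implementation would also use the next-hop MTU carried in the ICMP message when a router supplies one.

```python
# Plateau values from the table in RFC 1191, smallest to largest.
PLATEAUS = [68, 296, 508, 1006, 1492, 2002, 4352, 8166, 17914, 32000, 65535]


def discover_path_mtu(probe, plateaus=PLATEAUS):
    """Return the largest plateau that `probe` accepts, or None if none pass.

    `probe(size)` is a hypothetical callback: it should send a packet of
    `size` bytes with DF set and return True if the packet got through.
    """
    best = None
    for size in plateaus:
        if not probe(size):
            break  # first size that fails: back down to the previous one
        best = size
    return best


if __name__ == "__main__":
    # Simulated path whose narrowest hop is a PPPoE link (MTU 1492).
    print(discover_path_mtu(lambda size: size <= 1492))
```

Because the table is coarse, a plain Ethernet path (1500) would also settle on 1492 with this sketch.
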
>
> Or "a good guess". Almost everyone on broadband connections is
> connected to their broadband router with Ethernet or Wireless
> Ethernet. Even if they aren't, the router at the far end is connected
> directly or indirectly to one of the upstream routers with Ethernet,
> and a whole lot of peering happens over Ethernet. Almost nobody is
> using smaller-than-Ethernet MTUs in their core networks (because they
> don't want things with broken Path MTU Discovery, e.g., because there
> are firewalls blocking ICMP at the ends, to break because of *them*),
> but all the hosts are connected with Ethernet, so that's a good
> argument for 1500. However, I would strongly suggest that you
> consider the common tunnel cases, and use 1492 or smaller: 1492 is
> Ethernet payload (1500) minus PPPoE header (8). GRE can add 28 or 36
> bytes of overhead, IPIP adds 20, IPsec tunnels add 28, L2TP adds 12.
> Imagine cases like L2TP tunnel over PPPoE, and adjust accordingly.
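
The tunnel arithmetic can be written out as a small table lookup. The overhead figures are the ones quoted in the text (using the smaller GRE value); the dictionary and function names are made up for illustration.

```python
ETHERNET_MTU = 1500

# Per-tunnel overhead in bytes, as quoted in the text. GRE is listed there
# as "28 or 36"; the smaller figure is used here.
TUNNEL_OVERHEAD = {"pppoe": 8, "gre": 28, "ipip": 20, "ipsec": 28, "l2tp": 12}


def effective_mtu(*tunnels, link_mtu=ETHERNET_MTU):
    """Link MTU left over after subtracting the overhead of each tunnel."""
    return link_mtu - sum(TUNNEL_OVERHEAD[name] for name in tunnels)


if __name__ == "__main__":
    print(effective_mtu("pppoe"))          # 1492
    print(effective_mtu("l2tp", "pppoe"))  # 1480
```
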
>
> And remember to subtract the IP (typically 20) and UDP (8) header
> sizes when you're working out how big your maximum *payload* can be
> while staying within the maximum packet size you're looking for.
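
As a worked example of the header accounting: the largest UDP payload that fits in a given packet size is that size minus the IP and UDP headers. A minimal sketch, assuming an IPv4 header with no options (20 bytes); the function name is illustrative.

```python
def max_udp_payload(packet_size, ip_header=20, udp_header=8):
    """Largest UDP payload that fits in one packet of `packet_size` bytes.

    Assumes a 20-byte IPv4 header (no IP options) unless told otherwise.
    """
    return packet_size - ip_header - udp_header


if __name__ == "__main__":
    print(max_udp_payload(1500))  # 1472
    print(max_udp_payload(1492))  # 1464
```
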
>
> Also remember to round down to your encryption block size if, like us,
> you're encrypting the entire payload.
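
Rounding down to the cipher block size is a one-liner. The 16-byte default below is an assumption (it is the AES block size); the text does not say which cipher is in use.

```python
def round_down_to_block(payload_size, block_size=16):
    """Largest multiple of `block_size` that is <= `payload_size`.

    The 16-byte default is an assumed block size, not taken from the text.
    """
    return payload_size - (payload_size % block_size)


if __name__ == "__main__":
    print(round_down_to_block(1464))     # 1456
    print(round_down_to_block(1464, 8))  # 1464
```
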
>
>>
>>  Furthermore, I've read that if you "connect" your UDP socket
>>  to the remote side and then start sending large packets and
>>  backing off slowly, the socket layer will compute the "real"
>>  MTU between two endpoints, and you can obtain it through
>>  "getsockopt".  Do you know of anyone who's tried this, and
>>  the results?
>
> I'm not aware of any operating system that lets you access Path MTU
> Discovery like this, but I suppose it might exist. Certainly there
> are some which do it for TCP sessions.
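
For reference, Linux documents an `IP_MTU` socket option in ip(7) that returns the currently known path MTU of a connected socket, which is close to the "getsockopt" approach asked about. A hedged sketch follows: it is Linux-only, the numeric fallbacks are the values from `<linux/in.h>` for when Python's `socket` module does not export the names, and the function name is made up. The value returned is only the kernel's current estimate, which starts at the outgoing interface MTU and shrinks as Fragmentation Required messages arrive.

```python
import socket
import sys

# Linux option numbers from <linux/in.h>; used only if the socket module
# does not export the names itself.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)
IP_MTU = getattr(socket, "IP_MTU", 14)


def cached_path_mtu(host, port):
    """Return the kernel's current path MTU estimate to (host, port).

    Returns None on non-Linux platforms or if the lookup fails.
    """
    if not sys.platform.startswith("linux"):
        return None
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # Set DF on outgoing datagrams so the kernel tracks the path MTU.
            sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
            sock.connect((host, port))  # IP_MTU needs a connected socket
            return sock.getsockopt(socket.IPPROTO_IP, IP_MTU)
        except OSError:
            return None


if __name__ == "__main__":
    print(cached_path_mtu("127.0.0.1", 9))
```
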
>
> Matthew Kaufman
> matthew at matthew.at
> www.amicima.com