[p2p-hackers] Tighter HTTP and P2P integration??

Karl A. Magdsick kmagdsick at limewire.com
Wed Feb 15 02:00:34 UTC 2006


Charles Iliya Krempeaux wrote:
[snip]

>Would this be a more direct link?: http://plone.org/products/plonetramline
>  
>

No.  Tramline is just a somewhat clever hack to prevent the Zope webserver
from storing large files as objects in its object database.  If you're not
familiar with Zope, looking at Tramline will likely just confuse you.
Tramline was involved with my work only in a roundabout way.

>>I've done some preliminary work (along with Matt Hamilton from NetSight)
>>using an Apache plugin to add the X-Alt and X-Node HTTP headers
>>that allow Gnutella clients downloading the same file to find each other
>>and take load off the server.
>>    
>>
>
>(I know I could probably read the code to get this info, but I thought
>it might be easier to just ask, so....)  Could you explain the
>semantics and usage of X-Alt and X-Node more.  As well as elaborate
>more on how all this works, please.
>  
>

First, a short bit about the relationship between Gnutella and HTTP:

Gnutella uses HTTP to transfer all file data.  This is a huge advantage
for integration with webservers, as Gnutella clients can treat the webserver
as just another Gnutella client.  Many P2P networks invent their own
file-transfer protocols, but the overhead of HTTP isn't very large, and
the ability to pretend that regular webservers are peers is a huge win.

Optional HTTP headers are used to exchange information so that Gnutella
clients that are downloading the same file can form a "download mesh" -- a
set of clients that share and share alike chunks of a file they are all 
trying to download.

Once you've installed a webserver plugin to allow the webserver to
understand and send a very small number of optional headers, the webserver
can be thought of as a special case of Gnutella client: one that can't
search or download, but one that is capable of uploading files, helping
coordinate download meshes, and participating in download meshes.  It's all
just HTTP with a handful of optional headers.

The Gnutella protocol itself is only necessary for searching, and for
getting files from hosts that are unable to punch holes in NAT.  It's a fair
assumption that your webserver isn't behind NAT, or you've punched a hole
in NAT for your webserver.  In this case, the webserver only needs to speak
the Gnutella protocol if you want your webserver to respond to Gnutella
searches or you want some kind of web interface for searching Gnutella.

Facilitating a Gnutella mesh through a webserver is very simple, much
simpler than implementing a BitTorrent tracker.  There is nearly no extra
intelligence required in the webserver, and the clients all treat the
webserver as just another member of the download mesh.


Next, a bit about the headers:

X-Node: a Gnutella client that's able to punch through NAT/firewalls and
get an externally contactable IP address will send its external IP and
Gnutella port number in this header.

X-Alt: a Gnutella client will send IP:port pairs of Gnutella clients in the
download mesh using the X-Alt header.  If a client is able to punch through
NAT, it'll include itself in the X-Alt list once it is sharing part of the
file.

X-NAlt: this header contains a list of "bad" alternate locations.  In
essence, this header says "You gave me some bad entries in an X-Alt header.
Stop sending out the following IP:port pairs."

X-FAlt: this is the X-Alt header, but for mesh members that are unable to
punch through NAT/firewalls.  The entries contain information about which
proxies to use to contact these firewalled clients.

X-NFAlt: this is the firewalled version of X-NAlt.
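
To make the X-Alt description concrete, here is a rough Python sketch of
reading and writing such a header value.  It assumes the value is a
comma-separated list of IPv4 IP:port pairs, and that a missing port means
the conventional Gnutella port 6346.  The names parse_alt and format_alt
are made up for illustration and are not taken from LimeWire or the Apache
plugin.

```python
from typing import List, Tuple

# Assumption: 6346 is used when an entry carries no explicit port.
DEFAULT_PORT = 6346

Endpoint = Tuple[str, int]


def parse_alt(value: str) -> List[Endpoint]:
    """Parse a comma-separated list of IP:port pairs into (ip, port) tuples."""
    endpoints = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas and whitespace
        host, sep, port = item.partition(":")
        endpoints.append((host, int(port) if sep else DEFAULT_PORT))
    return endpoints


def format_alt(endpoints: List[Endpoint]) -> str:
    """Render (ip, port) tuples back into a header value."""
    return ",".join(f"{ip}:{port}" for ip, port in endpoints)
```

The same two helpers would serve for X-NAlt, since it carries the same kind
of IP:port list.  X-FAlt and X-NFAlt entries also carry proxy information,
so they would need a richer parser than this one.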


How this all works:

For each file to be downloaded, the webserver remembers information that it
gets in X-Alt headers, and spits the same information back out in X-Alt
headers.  It purges X-Alt entries that it sees in X-NAlt headers.  It does
the same thing for X-FAlt and X-NFAlt headers.  The X-Node information may
be useful in deciding which entries in its internal X-Alt pool should be
sent out.
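
The bookkeeping described above might look roughly like the following
Python sketch.  This is not the code of the Apache plugin.  The class name
AltPool, the handle method, and the cap of ten entries per response are all
assumptions made for illustration, and header names are matched exactly
rather than case-insensitively.

```python
from collections import defaultdict
from typing import Dict, Mapping, Set


class AltPool:
    """Per-file memory of alternate locations (hypothetical helper)."""

    def __init__(self) -> None:
        # Maps a requested path to the set of "ip:port" strings seen for it.
        self._pool: Dict[str, Set[str]] = defaultdict(set)

    @staticmethod
    def _split(value: str) -> Set[str]:
        return {item.strip() for item in value.split(",") if item.strip()}

    def handle(self, path: str, request_headers: Mapping[str, str],
               limit: int = 10) -> Dict[str, str]:
        """Update the pool from one request; return headers for the response."""
        entries = self._pool[path]
        # Remember what the client reports in X-Alt ...
        entries |= self._split(request_headers.get("X-Alt", ""))
        # ... and purge anything it reports as bad in X-NAlt.
        entries -= self._split(request_headers.get("X-NAlt", ""))
        if not entries:
            return {}
        # Assumption: send back at most `limit` entries, in sorted order.
        return {"X-Alt": ",".join(sorted(entries)[:limit])}
```

A second pool of the same shape could hold the X-FAlt entries and be purged
by X-NFAlt.  Using X-Node to decide which entries to send out is left out of
this sketch.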


>(My guess is that the client sends an X-Alt... and then the server
>responds with a X-Node, probably giving a list of nodes.  But please
>elaborate.)
>  
>

Unlike semi-centralized systems like Napster or BitTorrent, there is no
distinction between client and server in Gnutella.  X-Node is for "this is
me", and X-Alt is for "you may want to talk to these".  Both the "client"
and "server" side send the X-Alt headers.  Presumably, the "client" side
knows how it contacted the "server" and therefore knows how the "server" is
externally contactable, so it's useless for the "server" to send X-Node.

>>I haven't gone past proof-of-concept, but I have confirmed that two
>>LimeWire clients are able to find each other through an Apache
>>server using a plugin I helped write.  The two LimeWire clients
>>then share portions of the file with each other and thereby reduce
>>load on the server.
>>    
>>
>
>If I remember the Gnutella protocol correctly... Do the 2 clients
>still share this data via HTTP?  And it is a normally HTTP
>download?... or is BitTorrent being employed for this?
>  
>
It's pure HTTP.  There's no need to throw in some other protocol.
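
As an illustration of how portions of a file can be fetched with nothing
but HTTP, here is a small Python sketch that splits a file into byte ranges
and spreads them over the webserver and the mesh members.  It assumes that
partial transfers use standard HTTP Range request headers.  The function
plan_ranges and its round-robin assignment are invented for this example
and are not how LimeWire actually schedules its requests.

```python
from typing import List, Tuple


def plan_ranges(size: int, sources: List[str], chunk: int) -> List[Tuple[str, str]]:
    """Assign byte ranges to sources round-robin.

    Returns (source, Range header value) pairs covering bytes 0..size-1.
    """
    plan = []
    for i, start in enumerate(range(0, size, chunk)):
        end = min(start + chunk, size) - 1  # the end of an HTTP range is inclusive
        plan.append((sources[i % len(sources)], f"bytes={start}-{end}"))
    return plan
```

Each pair would turn into an ordinary GET request to that source with a
Range header.  A real client would also have to check which parts each peer
already holds before asking for them.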


I hope this helps,

Karl Magdsick
Software Engineer, Lime Wire LLC



