[p2p-hackers] Re: scalability

Bryan Turner bryan.turner at pobox.com
Fri Dec 2 20:15:45 UTC 2005


My $.02 on Gnutella,

	The Gnutella network will scale fine to 2B nodes.  However, I
believe that without interest clustering or intelligent peer selection, it
will become increasingly difficult to find the data you are interested in;
i.e., I feel the current architecture misses the 'long tail'.  (Note that I
am not well versed in Gnutella architecture; this opinion is based on papers
modeling the math behind Gnutella.)

	I like to find the orthogonal axes in a design.  P2P has lots of
interesting scalability axes:
1	Scalability in # of nodes
2	Scalability in # of objects
3	Scalability in size of objects
4	Scalability in interest for an object (hot spots)
5	Scalability in bandwidth (protocol overhead, efficiency)
	etc.

	BitTorrent captures all but #2, as multiple torrents may require
redundant connections to a peer, and torrents that share files cannot also
share swarms (not to mention BitTorrent isn't a content search network).

Gnutella (I believe) doesn't meet #2 or #3, and only partially meets #4 and #5:
	#2 because it does not cluster related data, so it will eventually
		be overwhelmed with content.
	#3 because it performs full-file transfers instead of block
		exchanges or partial file transfers.
	#4/5 because clients don't immediately offer partial downloads,
		thus hot spots have a congestion delay measured in
		full-file-transfer increments rather than in block
		increments (an order of 2 for typical MP3s, easily
		reaching multiple days of congestion).
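
A rough sketch of the arithmetic behind #4/5, assuming a downloader can only
start re-sharing once it holds one complete unit (the whole file, or a single
block).  The file size, block size, and upload rate are made-up numbers for
illustration, not measurements of any Gnutella or BitTorrent client:

```python
def seconds_until_reshare(unit_bytes: int, upload_bytes_per_sec: float) -> float:
    """Time a downloader waits before it holds one complete, shareable unit."""
    return unit_bytes / upload_bytes_per_sec


# Illustrative values only (assumptions, not figures from this thread).
FILE_BYTES = 4 * 1024 * 1024   # a roughly 4 MiB MP3
BLOCK_BYTES = 256 * 1024       # a 256 KiB block
UPLOAD_RATE = 16 * 1024        # 16 KiB/s of upstream from the source

# Full-file transfer: the new peer can serve others only after the whole file.
full_file_delay = seconds_until_reshare(FILE_BYTES, UPLOAD_RATE)

# Block exchange: the new peer can serve others after its first block.
block_delay = seconds_until_reshare(BLOCK_BYTES, UPLOAD_RATE)

# The ratio is just FILE_BYTES / BLOCK_BYTES, independent of the upload rate.
ratio = full_file_delay / block_delay
print(full_file_delay, block_delay, ratio)
```

Under this simple model the delay before a hot spot gains another source
shrinks by the number of blocks per file, whatever the link speed.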

	A vision for a network that scales along all axes would be Gnutella
with some structure to improve domain-specific searches, with BitTorrent as
the data transfer mechanism.

Please educate me if I've missed some facet of Gnutella!
--Bryan
bryan.turner at pobox.com

-----Original Message-----
From: p2p-hackers-bounces at zgp.org [mailto:p2p-hackers-bounces at zgp.org] On
Behalf Of Daniel Stutzbach
Sent: Thursday, December 01, 2005 3:52 PM
To: p2p-hackers at zgp.org
Subject: Re: [p2p-hackers] Re: scalability

On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote:
> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows 
> practically anyone to elevate to super-peer, which results in a random 
> (power-law distribution) network.

Gnutella is not a power-law network.  See my paper on the graph properties
of Gnutella, presented at the Internet Measurement Conference earlier this
year:

http://www.usenix.org/events/imc05/tech/stutzbach.html

> Such a network is not going to perform very well as far as recall and 
> precision are concerned, past a certain point. I would be interested 
> to calculate that exact point (but doubting I'll get to it some time 
> soon :-/).

Could you rigorously define recall and precision for me?  I'm not sure what
you mean by these terms.

-- 
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                     University of Oregon


