[p2p-hackers] clustering

Ranus networksimulator at gmail.com
Thu Mar 9 03:24:39 UTC 2006


Daniel:	You are right on Chord. Thanks for pointing that out.

Michael: No universally accepted definition of the scale-free network is
available now. The definition of "scale-free metric" does not require a
power-law degree distribution, but that's 99% the case in available research
and you can say it's the only case that matters. Some deviation in real life
case is reported (as the degree cannot grow ultimately)[1].

Many DHTs provide good connectivity but do not require global visibility,
and usually there's not much need in adding it. We do not really care if
it's scale free or not, we only care if we can find what we want. Under
current condition you cannot really  feel the difference when you're in this
part of the overlay network or another, there's no way to measure. 

But sometimes you do want to connect to certain nodes more than others. Then
starting at a random point and having no idea of the whole picture, how to
find the right part you belong quickly? This does not always mean global
visibility is required, though that does settle the problem. There could be
other approaches,  esp. when the network is large and decentralized. So it's
not only a matter of join and stay somewhere, but choose the right cluster
as well. Could we call this "heuristic clustering"? Again, the question fall
back to the very first one: who do you call as your 'relatives' and how to
ask for them from your current neighbors?


[1] http://www.santafe.edu/research/publications/workingpapers/01-03-021.pdf
--
Ranus Yue
Tsinghua University
 

> -----Original Message-----
> From: p2p-hackers-bounces at zgp.org 
> [mailto:p2p-hackers-bounces at zgp.org] On Behalf Of Daniel Stutzbach
> Sent: Thursday, March 09, 2006 1:16 AM
> To: 'Peer-to-peer development.'
> Subject: Re: [p2p-hackers] clustering
> 
> That's actually a very larger clustering coefficient, 
> compared to random graph with the same number of nodes and edges.
> 
> An Erdos-Renyi random graph (the standard random graph) every 
> edge exists with probability |E| / |V|^2.  Consequently, the 
> clustering coefficient is approximately |E| / |V|^2.  In 
> Chord, |E| = |V|*lg |V|, so we have a clustering coefficient 
> in the comparable random graph of lg |V| / |V|.
> 
> In your calculation, you're using n=2^|V|, so the clustering 
> coefficient is 3/(lg |V| - 2).  This is an enormous 
> clustering coefficient.  For |V| = 1 million, the clustering 
> coefficient is 17%!
> For a comparable random graph, it's only 0.002%.
> 
> (I use "lg x" to mean "log base 2 of x")
> 
> Scale-free networks scale well in some ways and scale very 
> badly in other ways.  Yes, they maintain short path lengths 
> as the network gets bigger.  However, they require that you 
> have these increasingly high-degree peers as the network gets 
> bigger.  
> 
> You may have some peers that have more capacity than others, 
> and it may make sense to utilize those resources, but it does 
> not seem wise to assume that if your network grows by a 
> factor of 100 that you will be able to a user who has 100 
> times more resources than your previous best-user.
> 
> It largely depends on what you're trying to achieve.  For a 
> network like Gnutella, global visibility is not one of the 
> design requirements, so it's OK to use a constant number of edges.
> 
> DHTs (typically) use O(log |V|) edges to guarantee a 
> worst-case of O(log |V|) steps per lookup.  That seems like a 
> good compromise to me.
> 
> -- 
> Daniel Stutzbach                           Computer Science 
> Ph.D Student
> http://www.barsoom.org/~agthorr                     
> University of Oregon
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers at zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences




More information about the P2p-hackers mailing list