[p2p-hackers] Re: scalability

Ronald Wertlen rrrw at neofonie.de
Sat Dec 3 23:04:16 UTC 2005


Hi Daniel,

recall and precision are basically benchmark measures (variables) from 
the information retrieval field, as I mentioned in my mail, that tell you 
how good your search is.  http://en.wikipedia.org/wiki/Information_retrieval
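
To make the two terms concrete, here is a minimal sketch of the usual IR 
definitions for a single query. The function name and the convention of 
returning 1.0 for an empty set are my own assumptions for illustration, 
not something taken from a particular system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: ids the search actually returned
    relevant:  ids that really satisfy the query (ground truth)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Precision: fraction of the returned results that are relevant.
    # Assumption: an empty result set counts as perfectly precise.
    precision = len(hits) / len(retrieved) if retrieved else 1.0
    # Recall: fraction of the relevant objects that were returned.
    # Assumption: a query with nothing relevant counts as fully recalled.
    recall = len(hits) / len(relevant) if relevant else 1.0
    return precision, recall


# Four results came back, two of them relevant; one relevant object was missed.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "e"])
print(p, r)
```

Here precision is 2/4 and recall is 2/3: half of what came back was 
unwanted, and a third of what should have been found was missed.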

For instance, Bloom filters increase your scalability but reduce the 
precision of the search: their false positives mean you get a lot of 
stuff you didn't want.
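
The trade-off can be sketched roughly as follows. This is a toy Bloom 
filter, assuming salted SHA-1 for the hash functions and a deliberately 
tiny bit array; the class, the keyword list and the probe strings are all 
hypothetical and not taken from any real p2p client.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: compact set summary with possible false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-1 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical peer that advertises its keywords through a 16-bit summary.
peer_keywords = ["coldplay", "radiohead", "mp3", "live", "2005"]
summary = BloomFilter(num_bits=16, num_hashes=2)
for word in peer_keywords:
    summary.add(word)

# Probe with keywords the peer does NOT have; any hit is a false positive.
probes = [f"other-{n}" for n in range(200)]
false_positives = [p for p in probes if summary.might_contain(p)]
print(f"{len(false_positives)} of {len(probes)} probes wrongly matched")
```

The summary never misses a keyword that was added, so recall is not hurt, 
but every false positive sends a query to a peer that has nothing to 
offer, which is where the precision goes. A larger bit array lowers the 
false-positive rate at the cost of a bigger summary to ship around.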

A few years ago, a lot of papers in the p2p field that were working on 
stuff like topology, organisational methods, scalability, etc. 
concentrated on finding better ways of getting from object_id to the 
node (number of hops, number of lookups, etc.). The problem from an IR 
perspective is that not all objects are as "simple" as an mp3 file and 
not all searches are as simple as "coldplay": how do you get the 
object_id in the first place? This becomes a more severe problem the more 
complex the objects, their metadata and the queries are (for instance 
Boolean, range, and content-proximity queries).
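
A rough sketch of why this bites, assuming a hypothetical scheme in which 
the responsible node is simply a hash of the key modulo the number of 
nodes (the function names and the "year:" key format are made up for 
illustration):

```python
import hashlib


def node_for(object_id, num_nodes):
    """Map a key to a node index by hashing (hypothetical placement rule)."""
    digest = hashlib.sha1(object_id.encode()).hexdigest()
    return int(digest, 16) % num_nodes


# Exact-match search: the query string itself can serve as the key,
# so a single lookup finds the responsible node.
print(node_for("coldplay", num_nodes=64))


def nodes_for_year_range(first, last, num_nodes):
    """Range query: hashing scatters neighbouring values, so every value
    in the range has to be looked up separately."""
    return {node_for(f"year:{y}", num_nodes) for y in range(first, last + 1)}


# Six years in the range means up to six different nodes to contact.
print(nodes_for_year_range(2000, 2005, num_nodes=64))
```

The exact-match case costs one lookup, while the range query costs one 
lookup per value, and a Boolean or proximity query has no obvious key at 
all under this scheme.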

I've downloaded your paper, thanks for the refutation. I love results 
that seem counter-intuitive to me because they mean I have some learning 
to do.  :-)

Best regards, Ron

> From: p2p-hackers-bounces at zgp.org [mailto:p2p-hackers-bounces at zgp.org] On
> Behalf Of Daniel Stutzbach
> Sent: Thursday, December 01, 2005 3:52 PM
> To: p2p-hackers at zgp.org
> Subject: Re: [p2p-hackers] Re: scalability
> 
> On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote:
> 
>>> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows 
>>> practically anyone to elevate to super-peer, which results in a random 
>>> (power-law distribution) network.
> 
> 
> Gnutella is not a power-law network.  See my paper on the graph properties
> of Gnutella, presented at the Internet Measurement Conference earlier this
> year:
> 
> http://www.usenix.org/events/imc05/tech/stutzbach.html
> 
>>> Such a network is not going to perform very well as far as recall and 
>>> precision are concerned, past a certain point. I would be interested 
>>> to calculate that exact point (but doubting I'll get to it some time 
>>> soon :-/).
> 
> 
> Could you rigorously define recall and precision for me?  I'm not sure what
> you mean by these terms.
> 
> -- 
> Daniel Stutzbach
> Computer Science Ph.D Student, University of Oregon
> http://www.barsoom.org/~agthorr



