[p2p-hackers] Rumorama: Scalable Summary Based Retrieval in P2P Networks

Wolfgang Müller wolfgang.mueller at wiai.uni-bamberg.de
Tue May 17 12:08:13 UTC 2005


Dear colleagues,

In the tech report 
http://mi.wiai.uni-bamberg.de/research/publications/bibtex/mueller2005scalable 
we present Rumorama, a protocol to make summary-based retrieval in P2P 
networks scalable. 

Summary-based retrieval in P2P networks, as introduced by PlanetP, uses 
summaries of peer data to select peers to contact for a retrieval process. In 
other words, PlanetP adapts GlOSS-like distributed IR techniques to P2P 
networks. The advantage of this method is that only very little indexing data 
has to be shipped as compared to distributed indexing structures, giving room  
for redundant storage of these summaries. Furthermore, in some situations, 
e.g. when processing multi-word boolean queries, summary-based retrieval 
needs to ship orders of magnitude less data than methods based on inverted 
lists (see e.g. the paper by Li et al. on the feasibility of keyword search 
in P2P networks, 
http://pdos.csail.mit.edu/~rtm/papers/search_feasibility.ps, for a 
discussion of the problems that indexing based on inverted files has to 
solve). The drawback of summary-based retrieval, up to Rumorama, has been 
that PlanetP does not scale.
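
To illustrate the idea of selecting peers by their summaries, here is a 
minimal sketch. It is not code from the tech report: the function name 
select_peers, the peer names and the use of plain Python sets as summaries 
are assumptions made for illustration (PlanetP itself summarizes peers with 
Bloom filters). The query is treated as a boolean AND over its terms.

```python
from typing import Dict, List, Set


def select_peers(summaries: Dict[str, Set[str]], query_terms: List[str]) -> List[str]:
    """Return the peers whose summary contains every query term (boolean AND).

    `summaries` maps a hypothetical peer name to the set of terms that peer
    advertises. A real system would use a compact structure such as a Bloom
    filter, which can also report false positives.
    """
    wanted = set(query_terms)
    return sorted(peer for peer, terms in summaries.items() if wanted <= terms)


# Toy data: three peers and the terms they advertise.
summaries = {
    "peer_a": {"p2p", "gossip", "retrieval"},
    "peer_b": {"p2p", "index"},
    "peer_c": {"retrieval", "p2p", "summary"},
}

# Only peers that may hold documents matching all terms are contacted.
candidates = select_peers(summaries, ["p2p", "retrieval"])
print(candidates)
```

Only the small summaries are consulted to pick candidates; the query itself 
is then sent to the selected peers, not to every peer.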

Rumorama achieves scalability: the number of neighbors of each peer grows 
logarithmically with the number of peers in the network, and the 
communication cost per peer for maintaining the network also grows 
logarithmically with network size. Finally, queries are processed in a 
number of hops (i.e. the depth of our multicast tree) that grows 
logarithmically with network size. 
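
As a rough illustration of why the depth of a multicast tree grows 
logarithmically, consider the following sketch. It assumes an idealized 
tree in which every peer that has received the query forwards it to a fixed 
number of new peers; the function multicast_depth and the fanout of 2 are 
assumptions for illustration and not the actual Rumorama construction.

```python
def multicast_depth(num_peers: int, fanout: int = 2) -> int:
    """Hops needed to reach `num_peers` peers in an idealized multicast tree.

    Assumes the root holds the query at hop 0 and every peer reached in one
    hop forwards the query to `fanout` previously unreached peers in the next.
    """
    if num_peers < 1 or fanout < 2:
        raise ValueError("need at least one peer and a fanout of at least 2")
    reached = 1   # the querying peer itself
    frontier = 1  # peers that received the query in the latest hop
    hops = 0
    while reached < num_peers:
        frontier *= fanout
        reached += frontier
        hops += 1
    return hops


for n in (1_000, 1_000_000):
    print(n, multicast_depth(n))
```

Under these assumptions, growing the network by a factor of 1000 only about 
doubles the number of hops.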

The number of nodes to be contacted for a query, however, grows linearly 
with network size. This is a known effect from distributed information 
retrieval. Still, depending on the query model, the absolute number of 
peers that have to be contacted to answer a query is very small with 
respect to network size.
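
The linear growth can be made concrete with a small sketch. It assumes, 
purely for illustration, that a fixed share of peers holds data relevant to 
a query; the function peers_contacted and the value of 10 matching peers 
per thousand are hypothetical and not results from the tech report.

```python
def peers_contacted(num_peers: int, matching_per_thousand: int) -> int:
    """Peers to contact if a fixed share of all peers is relevant to a query.

    The share is given per thousand peers so that the arithmetic stays in
    integers; the result is rounded up.
    """
    return (num_peers * matching_per_thousand + 999) // 1000


# Hypothetical share: 10 out of every 1000 peers match the query.
for n in (1_000, 10_000, 100_000):
    print(n, peers_contacted(n, 10))
```

The count grows in proportion to the network size, while the contacted 
share of the network stays the same.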

Enjoy the paper! We would be grateful for feedback and criticism.

Cheers,
Wolfgang Müller
Martin Eisenhardt
Andreas Henrich
-- 
Dr. Wolfgang Müller
LS Medieninformatik
Universität Bamberg


