[p2p-hackers] Rumorama: Scalable Summary Based Retrieval in P2P Networks
Wolfgang Müller
wolfgang.mueller at wiai.uni-bamberg.de
Tue May 17 12:08:13 UTC 2005
Dear colleagues,
In the tech report
http://mi.wiai.uni-bamberg.de/research/publications/bibtex/mueller2005scalable
we present Rumorama, a protocol to make summary-based retrieval in P2P
networks scalable.
Summary-based retrieval in P2P networks, as introduced by PlanetP, uses
summaries of peer data to select peers to contact for a retrieval process. In
other words, PlanetP adapts GlOSS-like distributed IR techniques to P2P
networks. The advantage of this method is that only very little indexing data
has to be shipped as compared to distributed indexing structures, giving room
for redundant storage of these summaries. Furthermore, in some situations,
e.g. when processing multi-word Boolean queries, summary-based retrieval
needs to ship orders of magnitude less data than methods based on inverted
lists. (The paper by Li et al. on the feasibility of keyword search in P2P
networks discusses the problems to be solved for inverted-file-based
indexing: http://pdos.csail.mit.edu/~rtm/papers/search_feasibility.ps) The
drawback of summary-based retrieval has been, until Rumorama, that PlanetP
is not scalable.
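To make the idea of summary-based peer selection concrete, here is a minimal
sketch. It is not the PlanetP or Rumorama implementation: it assumes each
peer's summary is simply the set of terms occurring in its documents, and the
names build_summary and select_peers as well as the example peers are
hypothetical.

```python
from typing import Dict, Iterable, List, Set


def build_summary(documents: Iterable[str]) -> Set[str]:
    """Summarise a peer's collection as the set of terms it contains.

    Assumption: a plain term set stands in for whatever compact summary
    a real system would ship.
    """
    summary: Set[str] = set()
    for doc in documents:
        summary.update(doc.lower().split())
    return summary


def select_peers(summaries: Dict[str, Set[str]], query_terms: List[str]) -> List[str]:
    """Return the peers whose summary contains every term of an AND query."""
    wanted = {term.lower() for term in query_terms}
    # Subset test: only peers that might hold a matching document survive.
    return sorted(peer for peer, summary in summaries.items() if wanted <= summary)


# Hypothetical peers and collections, for illustration only.
peers = {
    "peer_a": build_summary(["scalable retrieval in p2p networks"]),
    "peer_b": build_summary(["gossip protocols", "bloom filters"]),
    "peer_c": build_summary(["p2p gossip", "retrieval of images"]),
}

selected = select_peers(peers, ["p2p", "retrieval"])
print(selected)
```

Note that peer_c is selected although no single one of its documents contains
both terms: a summary only says which peers are worth contacting, and the
actual query still has to be evaluated at those peers.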
Rumorama achieves scalability: the number of neighbors of each peer grows
logarithmically with the number of peers in the network, and the
communication cost per peer for maintaining the network also grows
logarithmically with network size. Lastly, queries are processed in a number
of hops (i.e. the depth of our multicast tree) that grows logarithmically
with the network's size.
The number of nodes to be contacted for a query, however, grows linearly
with the network size. This is a known effect from distributed information
retrieval. Still, depending on the query model, the absolute number of peers
that have to be contacted to answer a query is very small relative to the
network size.
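The difference between the two growth rates can be illustrated with a small
back-of-the-envelope sketch. The numbers are purely illustrative assumptions
and are not taken from the tech report: a multicast tree with fan-out 2, and a
query that has to reach a fixed 1% of all peers.

```python
def multicast_depth(num_peers: int, fanout: int = 2) -> int:
    """Smallest depth at which a tree with the given fan-out has at least
    num_peers leaves (logarithmic in the network size). The fan-out of 2
    is an assumption for illustration."""
    depth, reach = 0, 1
    while reach < num_peers:
        reach *= fanout
        depth += 1
    return depth


def peers_contacted(num_peers: int, fraction: float = 0.01) -> int:
    """Peers contacted if a fixed fraction of the network holds candidate
    data (linear in the network size). The 1% fraction is hypothetical."""
    return max(1, round(num_peers * fraction))


for n in (1_000, 1_000_000):
    print(n, multicast_depth(n), peers_contacted(n))
```

Under these assumptions, growing the network by a factor of 1000 only doubles
the hop count, while the number of contacted peers grows by the same factor
of 1000 yet remains a small share of the network.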
Enjoy the paper! We would be grateful for feedback and criticism.
Cheers,
Wolfgang Müller
Martin Eisenhardt
Andreas Henrich
--
Dr. Wolfgang Müller
LS Medieninformatik
Universität Bamberg