[p2p-hackers] Re: scalability
SIMON Gwendal RD-MAPS-ISS
gwendal.simon at francetelecom.com
Fri Dec 2 09:38:14 UTC 2005
Hi Alexander,
This work is close to what we are doing for Maay [1]. As we have only just begun to implement it, it would be great if you could take part in the early protocol discussions on the mailing list.
The current Maay implementation [2] is very open. We are developing a basic indexer that communicates with the "Maay node" over XML-RPC. The "Maay node" manages communication and the SQL database. It can be controlled through a web interface.
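A minimal sketch of that indexer/node split, using only the Python standard library. The class name `NodeSketch` and the methods `index_document` and `search` are hypothetical and chosen for illustration; the actual Maay XML-RPC interface is not described in this message and may look quite different.

```python
import sqlite3
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class NodeSketch:
    """Hypothetical stand-in for a node that owns the SQL database."""

    def __init__(self):
        # check_same_thread=False because the server thread uses this connection.
        self.db = sqlite3.connect(":memory:", check_same_thread=False)
        self.db.execute("CREATE TABLE postings (word TEXT, doc_id TEXT)")
        self.lock = threading.Lock()

    def index_document(self, doc_id, words):
        """Store one (word, doc_id) row per distinct word; return the count."""
        distinct = sorted(set(w.lower() for w in words))
        with self.lock:
            self.db.executemany(
                "INSERT INTO postings VALUES (?, ?)",
                [(w, doc_id) for w in distinct],
            )
            self.db.commit()
        return len(distinct)

    def search(self, word):
        """Return the sorted ids of the documents that contain the word."""
        with self.lock:
            rows = self.db.execute(
                "SELECT DISTINCT doc_id FROM postings "
                "WHERE word = ? ORDER BY doc_id",
                (word.lower(),),
            ).fetchall()
        return [row[0] for row in rows]


def start_node():
    """Serve a NodeSketch on a free local port; return (server, url)."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_instance(NodeSketch())
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    return server, f"http://{host}:{port}/"


def index_text(node_url, doc_id, text):
    """Indexer side: tokenize naively and push the words to the node."""
    with ServerProxy(node_url) as node:
        return node.index_document(doc_id, text.split())
```

In this sketch the indexer never touches the database itself: it only calls the node over XML-RPC, so a different indexer could be swapped in without changing the node.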
Have fun!
-- Gwendal
[1]: MAAY: a decentralized personalized search system, F. Dang Ngoc, J. Keller, G. Simon. SAINT'2006 http://maay.netofpeers.net/documentation/maay_SAINT2006.pdf
[2]: http://maay.netofpeers.net
> -----Original Message-----
> From: p2p-hackers-bounces at zgp.org
> [mailto:p2p-hackers-bounces at zgp.org] On behalf of Alexander Löser
> Sent: Friday, 2 December 2005 10:29
> To: Peer-to-peer development.
> Subject: Re: [p2p-hackers] Re: scalability
>
> Hi Adam,
> originally there was a certain kind of clustering in the early days of
> Gnutella (late 1990s). People passed their IDs on by word of mouth, by
> email, or via Deja News. So in most cases you got IDs from people who
> had at least similar interests, or from people you expected to have
> some interesting files. Later, due to the overwhelming attractiveness
> of the Gnutella application, they introduced the gtk and other
> bootstrapping alternatives, giving you a number of starting pointers.
> However, these starting points are chosen 'randomly', so there is no
> longer any clustering by interests.
>
> We (Berlin and Karlsruhe) developed a new protocol (INGA, Interest-based
> Node Grouping Algorithm [1][2]) that reclusters the network based on
> the interests of the peers, without any DHT, using only an unstructured
> network. Similar to Freenet, the network topology evolves over time
> into a so-called small-world topology, where people with similar
> interests are clustered together. In addition, to further speed up the
> clustering process, peers also keep in a local index structure other
> peers that are 'hubs' in the network, i.e. that have a high in- and
> out-degree. Our experiments show that we significantly outperform
> Gnutella-style approaches in the number of messages, even in highly
> volatile networks.
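The two ideas in the paragraph above (links that drift towards peers with similar interests, plus a separate local index of well-connected peers) can be illustrated roughly as follows. This is not the INGA algorithm from [1][2]; the Jaccard similarity measure, the `Peer` class, and the `observe` and `best_hub` names are all assumptions made for the sake of a small example.

```python
from dataclasses import dataclass, field


def jaccard(a, b):
    """Overlap of two interest sets, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass(eq=False)  # compare peers by identity, not by field values
class Peer:
    name: str
    interests: frozenset
    max_links: int = 2
    links: list = field(default_factory=list)  # interest-based shortcuts
    hubs: dict = field(default_factory=dict)   # peer name -> observed degree

    def observe(self, other, degree):
        """Learn about another peer, e.g. from a query answer passing by."""
        if other is self:
            return
        # Keep only the max_links peers whose interests overlap most.
        if other not in self.links:
            self.links.append(other)
        self.links.sort(
            key=lambda p: (-jaccard(self.interests, p.interests), p.name)
        )
        del self.links[self.max_links:]
        # Remember well-connected peers separately, whatever their interests.
        self.hubs[other.name] = degree

    def best_hub(self):
        """Name of the best-connected peer seen so far, or None."""
        if not self.hubs:
            return None
        return max(sorted(self.hubs), key=lambda name: self.hubs[name])
```

Repeated over many observations, each peer's `links` list ends up pointing at peers with overlapping interests, while `hubs` gives it a fallback entry point when none of those links can answer a query.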
>
> Best, Alex
>
> [1] Searching Dynamic Communities with Personal Indexes. Löser, Tempich
> et al. 3rd International Semantic Web Conference, Galway. Springer 2005
> http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf
> [2] Remindin': Semantic query routing in peer-to-peer networks based on
> social metaphors. Tempich et al. WWW 2004, New York. ACM 2004
> http://www.aifb.uni-karlsruhe.de/Publikationen/showPublikation?publ_id=447
>
> Ronald Wertlen wrote:
>
> > Hi Adam,
> >
> > perhaps you have not understood my message because you have not
> > noticed the focus on "precision and recall" (i.e. search), not the old
> > distributed DB vs. own DB debate. You have also pigeon-holed my email
> > with the DHT crowd (*grin*); it couldn't be further from it!
> >
> > I was arguing in the other direction - which coderman thankfully
> > picked up. Gnutella doesn't structure enough, that's all. Sure,
> > Gnutella beats DHTs on search - I base that observation on a project I
> > finished last year - a public prototype that used JXTA and was honed
> > for search using super-peers [DFN S2S http://s2s.neofonie.de/
> > (German site) - we've moved on some since then ;) ].
> >
> > Gnutella 0.6 (is there a 0.7 protocol? I can't find it) allows
> > practically anyone to elevate to super-peer, which results in a random
> > (power-law distribution) network. Such a network is not going to
> > perform very well as far as recall and precision are concerned, past a
> > certain point. I would be interested to calculate that exact point
> > (but I doubt I'll get to it any time soon :-/).
> >
> > HTH.
> >
> > Best regards, Ron
> >
> > PS. It seems this thread has driven the original author to reformulate
> > his statement... :-)
> >
> > PPS.
> > In fact, the network is not going to be completely random - it will
> > follow the contours of the internet (distribution of servers,
> > broadband connections, users, etc. is not random). I am not sure if
> > that destroys or supports my argument. Back to the drawing board!
> >
> > We actually need a better internet. [oops there I go getting
> > unspecific again, sorry!! ;-) ]
> >
> >
> >> Message: 4
> >> Date: Wed, 30 Nov 2005 16:42:39 -0500
> >> From: Adam Fisk <afisk at speedymail.org>
> >> Subject: Re: [p2p-hackers] Re: scalability
> >> To: "Peer-to-peer development." <p2p-hackers at zgp.org>
> >> Message-ID: <438E1CCF.4010907 at speedymail.org>
> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >>
> >> I don't understand your post. When you say "critical", I assume
> >> you're talking about life and death situations? Are you talking
> >> about anything specifically? DHTs have failure rates. Ad hoc and
> >> mesh networks can become useful in emergency situations where
> >> conventional infrastructures break down, but the
> >> centralized/p2p/structured/unstructured questions here are far from
> >> obvious.
> >>
> >> On the "obsessive science types" issue, this completely misses the
> >> point. It's a very non "obsessive science type" statement. There
> >> are strong reasons for using the massive indexing/random walk
> >> approach above DHTs -- reasons that have nothing to do with
> >> scalability. In particular, DHTs are, well, hash tables. Hash
> >> tables don't work well for metadata queries. They do fine for
> >> keywords (hotspots are a problem, but they can be solved), but they
> >> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze
> >> together, for example. The massive indexing (mutual index caching, to
> >> use Serguei's term)/random walk approach can get around these issues
> >> more easily. They are also not nearly as brittle as DHTs. Sure,
> >> DHTs repair themselves after node joins and leaves, but node
> >> transience generally has a much greater effect on DHTs than it does
> >> on massive indexing networks.
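A rough illustration of the point about hash tables and metadata, as a sketch only. The function names and the predicate-based query format are assumptions for this example and do not come from any particular DHT or from the systems discussed in the thread.

```python
import hashlib


def responsible_node(key, num_nodes):
    """Map an exact key to a node id the way a hash table does: by hashing it."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def matches(metadata, query):
    """True if every (field name, predicate) pair in the query holds."""
    return all(
        name in metadata and predicate(metadata[name])
        for name, predicate in query.items()
    )


def search_local_index(index, query):
    """Scan a peer's cached index of other peers' metadata records."""
    return sorted(doc for doc, metadata in index.items() if matches(metadata, query))
```

An exact keyword such as "jazz" hashes to exactly one responsible node, which is why keyword lookups suit a DHT. A query such as "format is pdf and year is at least 2004" has no single key to hash, whereas a peer that caches its neighbours' metadata can simply evaluate the predicates against its local index.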
> >>
> >> I also think you're underestimating the efficiency of massive
> >> indexing and random walks. Sure, these networks don't scale
> >> logarithmically, but they do pretty darn well.
> >> I encourage everyone to stay specific with their posts.
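To make the massive indexing/random walk idea concrete, here is a small sketch under simplifying assumptions: every peer caches the keywords of its direct neighbours only, and a query takes one uniformly random hop at a time until a TTL runs out. The function names and the graph representation are hypothetical; real deployments differ in what is cached and how walks are steered.

```python
import random


def build_cached_index(graph, content):
    """Each peer caches its own keywords plus those of its direct neighbours."""
    index = {}
    for peer, neighbours in graph.items():
        index[peer] = {p: content.get(p, set()) for p in [peer, *neighbours]}
    return index


def random_walk_search(graph, index, start, keyword, ttl, rng):
    """Walk at most ttl hops; return (holder, hops) or None if nothing is found."""
    current = start
    for hops in range(ttl + 1):
        # Check the cached index at the current peer before moving on.
        holders = sorted(
            p for p, words in index[current].items() if keyword in words
        )
        if holders:
            return holders[0], hops
        if not graph[current]:
            return None  # dead end: no neighbour to forward the query to
        current = rng.choice(graph[current])
    return None
```

Because a well-connected peer caches the keywords of all its neighbours, a walk that reaches it can answer for many peers at once, which is one reason such networks can do well without logarithmic routing guarantees.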
> >>
> >> All the Best,
> >>
> >> Adam
> >
> >
> >
> > _______________________________________________
> > p2p-hackers mailing list
> > p2p-hackers at zgp.org
> > http://zgp.org/mailman/listinfo/p2p-hackers
> > _______________________________________________
> > Here is a web page listing P2P Conferences:
> > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
> >
>
>
> --
> ___________________________________________________________
>
> Dr. Alexander Löser,
> Technische Universität Berlin,
> CIS, Sekr. EN 7, Einsteinufer 17, 10587 Berlin, GERMANY
> office: +49- 30-314-25556 fax: +49- 30-314-21601
> web: http://cis.cs.tu-berlin.de/~aloeser/
> ___________________________________________________________
>