[p2p-hackers] Re: scalability

SIMON Gwendal RD-MAPS-ISS gwendal.simon at francetelecom.com
Fri Dec 2 09:38:14 UTC 2005


Hi Alexander,

	This work is close to what we are doing for Maay [1]. As we have only just begun to implement it, it would be great if you could take part in the early protocol discussion on the mailing list.

	The current Maay implementation [2] is very open. We are developing a basic indexer that communicates with the "Maay node" over XML-RPC. The "Maay node" manages communication and the SQL database, and it can be controlled through a web interface.
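
	To make the indexer/node split concrete, here is a minimal sketch of an indexer pushing a document to a node over XML-RPC, using only the Python standard library. It is not the actual Maay interface: the class NodeStub, the method name update_document and its arguments are all assumptions made for illustration, and the stub keeps documents in memory instead of an SQL database.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class NodeStub:
    """Stand-in for the node side; stores documents in memory (assumption)."""

    def __init__(self):
        self.documents = {}

    def update_document(self, doc_id, title, words):
        # Hypothetical RPC method: record the distinct words of a document.
        self.documents[doc_id] = {"title": title, "words": sorted(set(words))}
        return len(self.documents)


def start_node(host="127.0.0.1", port=0):
    """Start the stub node in a background thread; port 0 picks a free port."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    node = NodeStub()
    server.register_instance(node)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, node


def index_text(node_url, doc_id, title, text):
    """Indexer side: split the text into lowercase words and send them."""
    words = [word.lower() for word in text.split()]
    with ServerProxy(node_url) as proxy:
        return proxy.update_document(doc_id, title, words)
```

	The point of the split is that the indexer needs nothing but the node's URL, so it can be replaced or rewritten in another language without touching the node.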

	Have fun!

-- Gwendal


[1]: MAAY: a decentralized personalized search system,  F. Dang Ngoc, J. Keller, G. Simon. SAINT'2006 http://maay.netofpeers.net/documentation/maay_SAINT2006.pdf

[2]: http://maay.netofpeers.net

 

> -----Original Message-----
> From: p2p-hackers-bounces at zgp.org
> [mailto:p2p-hackers-bounces at zgp.org] On Behalf Of Alexander Löser
> Sent: Friday, 2 December 2005 10:29
> To: Peer-to-peer development.
> Subject: Re: [p2p-hackers] Re: scalability
> 
> Hi Adam,
> originally there was a certain type of clustering in the early days of
> Gnutella (late 90s). People passed their IDs on to other people by word
> of mouth, via email or via Deja News. So in most cases you got IDs from
> people who had at least similar interests, or from people from whom you
> expected some interesting files. Later, due to the overwhelming
> popularity of the Gnutella application, they introduced the gtk and
> other bootstrapping alternatives, giving you a number of starting
> pointers. However, these starting points are chosen 'randomly', so there
> is no longer any clustering by interests.
> 
> We (Berlin and Karlsruhe) developed a new protocol, INGA (Interest-based
> Node Grouping Algorithm) [1][2], that reclusters the network based on
> the interests of the peers, without any DHT, using only an
> unstructured network. Similar to Freenet, the network topology evolves
> over time into a so-called small-world topology, where people with
> similar interests are clustered together. In addition, to further speed
> up the clustering process, peers also keep in a local index structure
> other peers that are 'hubs' in the network, e.g. peers with a high
> in- and out-degree. Our experiments show that we significantly
> outperform Gnutella-style approaches in the number of messages, even in
> highly volatile networks.
> 
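The hub idea in the paragraph above can be illustrated with a small sketch. This is not INGA's actual procedure: the function name top_hubs, the input format (a list of directed links a peer has observed) and the ranking by in-degree plus out-degree are assumptions made only to show what "keeping high-degree peers in a local index" could look like.

```python
from collections import Counter


def top_hubs(observed_links, k=3):
    """Return the k peers with the highest in-degree plus out-degree.

    observed_links is a list of (source, target) pairs that this peer has
    learned about; duplicates are counted once.
    """
    degree = Counter()
    for source, target in set(observed_links):
        degree[source] += 1  # contributes to the source's out-degree
        degree[target] += 1  # contributes to the target's in-degree
    # Highest degree first; ties are broken by peer id to stay deterministic.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return [peer for peer, _ in ranked[:k]]
```

A peer could recompute this list whenever it learns about new links and use the result as extra shortcuts when forwarding a query.
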
> Best, Alex
> 
> [1] Searching Dynamic Communities with Personal Indexes. Löser, Tempich
> et al. 3rd International Semantic Web Conference, Galway. Springer 2005
> http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf
> [2] Remindin': Semantic query routing in peer-to-peer networks based on
> social metaphors. Tempich et al. WWW 2004, New York. ACM 2004
> http://www.aifb.uni-karlsruhe.de/Publikationen/showPublikation?publ_id=447
> 
> Ronald Wertlen wrote:
> 
> > Hi Adam,
> >
> > perhaps you have not understood my message because you have not
> > noticed the focus on "precision and recall" (i.e. search), not the old
> > distributed DB vs. own DB debate. You have also pigeon-holed my email
> > with the DHT crowd (*grin*); it couldn't be further from it!
> >
> > I was arguing in the other direction - which coderman thankfully
> > picked up.  Gnutella doesn't structure enough, that's all. Sure,
> > Gnutella beats DHTs on search - I base that observation on a project I
> > finished last year - a public prototype that used JXTA and was honed
> > for search using super-peers   [DFN S2S http://s2s.neofonie.de/
> > (German site) - we've moved on some since then  ;) ].
> >
> > Gnutella 0.6 (is there a 0.7 protocol? I can't find it) allows
> > practically anyone to elevate to super-peer, which results in a random
> > (power-law distribution) network. Such a network is not going to
> > perform very well as far as recall and precision are concerned, past a
> > certain point. I would be interested in calculating that exact point
> > (but I doubt I'll get to it any time soon :-/).
> >
> > HTH.
> >
> > Best regards, Ron
> >
> > PS. seems this thread has driven the original author to reformulate
> > his statement...  :-)
> >
> > PPS.
> > In fact, the network is not going to be completely random - it will 
> > follow the contours of the internet (distribution of servers, 
> > broadband connections, users, etc. is not random). I am not sure if 
> > that destroys or supports my argument. Back to the drawing board!
> >
> > We actually need a better internet. [oops there I go getting 
> > unspecific again, sorry!!  ;-) ]
> >
> >
> >> Message: 4
> >> Date: Wed, 30 Nov 2005 16:42:39 -0500
> >> From: Adam Fisk <afisk at speedymail.org>
> >> Subject: Re: [p2p-hackers] Re: scalability
> >> To: "Peer-to-peer development." <p2p-hackers at zgp.org>
> >> Message-ID: <438E1CCF.4010907 at speedymail.org>
> >> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
> >>
> >> I don't understand your post.  When you say "critical", I assume 
> >> you're talking about life and death situations?  Are you talking 
> >> about anything specifically?  DHTs have failure rates.  Ad hoc and 
> >> mesh networks can become useful in emergency situations where 
> >> conventional infrastructures break down, but the
> >> centralized/p2p/structured/unstructured questions here are far from
> >> obvious.
> >>
> >> On the "obsessive science types" issue, this completely misses the
> >> point.  It's a very non "obsessive science type" statement.  There
> >> are strong reasons for using the massive indexing/random walk
> >> approach over DHTs -- reasons that have nothing to do with
> >> scalability.  In particular, DHTs are, well, hash tables.  Hash
> >> tables don't work well for metadata queries.  They do fine for
> >> keywords (hotspots are a problem, but they can be solved), but they
> >> aren't as nice a fit for metadata.  RDF and DHTs are tough to squeeze
> >> together, for example.  The massive indexing (mutual index caching, to
> >> use Serguei's term)/random walk approach can get around these issues
> >> more easily.  These networks are also not nearly as brittle as DHTs.
> >> Sure, DHTs repair themselves after node joins and leaves, but node
> >> transience generally has a much greater effect on DHTs than it does
> >> on massive indexing networks.
> >>
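The "DHTs are hash tables" point above can be shown with a toy example. This is a deliberately simplified sketch, not a real DHT: the class ToyDHT, the modulo placement and the metadata fields are assumptions, and real systems use consistent hashing plus multi-hop routing. It only shows that an exact key maps to one node, while a query over metadata fields has no single key to hash and ends up visiting every node.

```python
import hashlib


def node_for_key(key, node_count):
    """Map a key to a node index by hashing it (toy placement rule)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % node_count


class ToyDHT:
    def __init__(self, node_count):
        # One in-memory dict per simulated node.
        self.nodes = [{} for _ in range(node_count)]

    def put(self, key, metadata):
        node = self.nodes[node_for_key(key, len(self.nodes))]
        node.setdefault(key, []).append(metadata)

    def get(self, key):
        """Exact-keyword lookup: hashing names the one node to ask."""
        node = self.nodes[node_for_key(key, len(self.nodes))]
        return list(node.get(key, [])), 1

    def find(self, predicate):
        """Metadata query: no key to hash, so every node is visited."""
        matches = []
        for node in self.nodes:
            for entries in node.values():
                matches.extend(m for m in entries if predicate(m))
        return matches, len(self.nodes)
```

Both methods return the results together with the number of nodes contacted, which is 1 for a keyword lookup and the full node count for the metadata query in this toy model.
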
> >> I also think you're underestimating the efficiency of massive 
> >> indexing and random walks.  Sure, these networks don't scale 
> >> logarithmically, but they do pretty darn well.
> >> I encourage everyone to stay specific with their posts.
> >>
> >> All the Best,
> >>
> >> Adam
> >
> >
> >
> > _______________________________________________
> > p2p-hackers mailing list
> > p2p-hackers at zgp.org
> > http://zgp.org/mailman/listinfo/p2p-hackers
> > _______________________________________________
> > Here is a web page listing P2P Conferences:
> > http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
> >
> 
> 
> -- 
> ___________________________________________________________
> 
>   Dr. Alexander Löser, 
>   Technische Universität Berlin,
>   CIS, Sekr. EN 7, Einsteinufer 17, 10587 Berlin, GERMANY
>   office: +49- 30-314-25556  fax: +49- 30-314-21601
>   web: http://cis.cs.tu-berlin.de/~aloeser/	
> ___________________________________________________________
> 
> 


