From threelions0916 at yahoo.com.cn Thu Dec 1 06:34:37 2005
From: threelions0916 at yahoo.com.cn (Michael Liu)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] P2P in SFC
References: <438B8E20.3030903@quinthar.com><20051129011547.34494.qmail@web53605.mail.yahoo.com> <40513.67.188.193.83.1133228692.squirrel@webmail.redswoosh.net>
Message-ID: <002301c5f641$4ee31820$3118080a@cnc.intra>

you guys are so lucky, I want to attend the meeting too, but I am in china :-(

Michael

----- Original Message -----
From: "Travis Kalanick" <travis@redswoosh.net>
To: "Peer-to-peer development." <p2p-hackers@zgp.org>
Sent: Tuesday, November 29, 2005 9:44 AM
Subject: Re: [p2p-hackers] P2P in SFC

> Lemon, I agree with you.  Since most people seem to be able to make the
> Wednesday time, maybe we should finalize on that.
>
> David, do we have a consensus?
>
> T
>
> Lemon Obrien said:
> > I can come anytime...I think this would be neat; i don't know about you
> > guys; but i own my own company; coming out with a product soon. I know
> > david has iGlance...which is not in my space, but believe meeting other
> > like minded people who know "peer" is the next big thing...sorry my
> > friends; but the web is played out.
> >
> >   el
> >
> > David Barrett <dbarrett@quinthar.com> wrote:
> >   What day/time would you propose?
> >
> > Serguei Osokine wrote:
> >>>Maybe, Wednesday, 9pm?
> >>
> >> Sorry - I'm busy on Wednesday...
> >>
> >> -----Original Message-----
> >> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> >> Behalf Of David Barrett
> >> Sent: Monday, November 28, 2005 1:40 PM
> >> To: Peer-to-peer development.
> >> Subject: [p2p-hackers] P2P in SFC
> >>
> >> So looks like there's a decent showing of P2P guys in San Francisco --
> >> six by my count. How about sushi and beer this week at, say Ryoko?
> >>
> >> http://tinyurl.com/bkk5d
> >>
> >> Maybe, Wednesday, 9pm? Any objections or affirmations?
> >>
> >> -david
> >>
> >> _______________________________________________
> >> p2p-hackers mailing list
> >> p2p-hackers@zgp.org
> >> http://zgp.org/mailman/listinfo/p2p-hackers
> >> _______________________________________________
> >> Here is a web page listing P2P Conferences:
> >> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
> >
> > You don't get no juice unless you squeeze
> > Lemon Obrien, the Third.
>
> Travis Kalanick
> Red Swoosh, Inc.
> Founder, Chairman
> travis@redswoosh.net
> (v) 310.666.1429
> (f) 253.322.9478
> AIM: ScourTrav123

From gbildson at limepeer.com Thu Dec 1 16:36:14 2005
From: gbildson at limepeer.com (Greg Bildson)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p framework
In-Reply-To: <71b79fa90511301016q54c57883se119ee54ea01212c@mail.gmail.com>
Message-ID:

You could build off the limewire.org open source code (currently down) or off of the gtk-gnutella or gnucleus source.
You would want to form a separate network, since current vendors would consider alternate uses of the existing network as pollution. If you envision services similar to those that exist, or extensions to the existing services, then this might make sense. If you want something as a basis for a new clean protocol, I might not recommend it, since some aspects of the protocol are a little ugly underneath the covers. However, extension mechanisms do exist for most message types.

Thanks
-greg

> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> Behalf Of Davide Carboni
> Sent: Wednesday, November 30, 2005 1:17 PM
> To: Peer-to-peer development.
> Subject: Re: [p2p-hackers] p2p framework
>
> On 11/29/05, Greg Bildson wrote:
> > Do you mean Gnutella's use as a framework or otherwise?
>
> Yes I do. My question is: are there any implementations of gnutella
> that can be used to build new applications upon and to develop new
> services (beyond simple file sharing)?
>
> D.

From unixsmaxer at hotmail.com Thu Dec 1 18:28:16 2005
From: unixsmaxer at hotmail.com (Salem Mark)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p framework
In-Reply-To:
Message-ID:

>From: "Greg Bildson"
>You could build off the limewire.org open source code (currently down) or
>off of the gtk-gnutella or gnucleus source. You would want to form a
>separate network since current vendors would consider alternate uses of the
>existing network as pollution.

Could you please elaborate on forming a separate network under Gnutella?
I was thinking of using the Echomine-Muse gnutella API, which facilitates sending custom messages in Gnutella, as a technique for Jabber servers to collaborate and achieve global service discovery.

Thanks.

- Salem

>If you envisioned similar services as exist or extensions to the existing
>services then this might make sense. If you want something as a basis for a
>new clean protocol, I might not recommend it since some aspects of the
>protocol are a little ugly underneath the covers. However, extension
>mechanisms do exist for most message types.
>
>Thanks
>-greg
>
> > -----Original Message-----
> > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> > Behalf Of Davide Carboni
> > Sent: Wednesday, November 30, 2005 1:17 PM
> > To: Peer-to-peer development.
> > Subject: Re: [p2p-hackers] p2p framework
> >
> > On 11/29/05, Greg Bildson wrote:
> > > Do you mean Gnutella's use as a framework or otherwise?
> >
> > Yes I do. My question is: are there some implementation of gnutella
> > that can be used to build upon new applications and to develop new
> > services (beyond simple file sharing) ?
> >
> > D.

From unixsmaxer at hotmail.com Thu Dec 1 18:38:46 2005
From: unixsmaxer at hotmail.com (Salem Mark)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
Message-ID:

Hello,

I have read in several papers that it is unlikely that the integrity of the DHT can be maintained where there is a high node or link failure rate without significant message transmission overhead. In other words, in "highly transient networks", where the number of nodes appearing and disappearing is very high, maintaining the DHT becomes hard and introduces considerable overhead.

I am trying to find out what exactly "highly-transient" means. A file sharing network like Gnutella seems to be highly transient, since peers join and leave the network frequently. Could somebody elaborate on this? Is there a node departure/arrival/failure rate (per sec? per min?) that identifies "highly-transient" networks?

Thanks

- Salem

From gbildson at limepeer.com Thu Dec 1 18:44:00 2005
From: gbildson at limepeer.com (Greg Bildson)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p framework
In-Reply-To:
Message-ID:

There is a standard connect string in Gnutella. Something like "Gnutella Connect" / "Gnutella OK". Change that. Set up your own GWebCache or UHC (UDP host cache), or include your own gnutella.net ip:ports file, and you should be able to bootstrap your own network.

I'm not aware of any mainstream users of the Echomine-Muse libraries. They may or may not work. I expect that they are primitive compared to the LimeWire and gtk-gnutella code. However, they may work for your purposes.
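
As a rough sketch of the bootstrapping steps described above (not LimeWire's or gtk-gnutella's actual code): assuming the 0.6-style handshake opens with a "GNUTELLA CONNECT/0.6" line that is answered with "GNUTELLA/0.6 200 OK", a separate network could swap in its own name and ship its own host list. The name MYNET, the helper functions, and the one-ip:port-per-line file layout are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple

GNUTELLA_NAME = "GNUTELLA"  # name used in the stock handshake
PRIVATE_NAME = "MYNET"      # hypothetical name for the separate network


def build_connect(network: str = PRIVATE_NAME, version: str = "0.6",
                  headers: Optional[Dict[str, str]] = None) -> bytes:
    """Build the first handshake message a connecting peer would send."""
    lines = [f"{network} CONNECT/{version}"]
    for key, value in (headers or {}).items():
        lines.append(f"{key}: {value}")
    # A blank line terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def build_ok(network: str = PRIVATE_NAME, version: str = "0.6") -> bytes:
    """Build the acceptance reply for the private network."""
    return f"{network}/{version} 200 OK\r\n\r\n".encode("ascii")


def accepts(first_message: bytes, network: str = PRIVATE_NAME) -> bool:
    """Accept only peers whose connect line names our network."""
    first_line = first_message.split(b"\r\n", 1)[0].decode("ascii", "replace")
    return first_line.startswith(network + " CONNECT/")


def load_hosts(text: str) -> List[Tuple[str, int]]:
    """Parse a bootstrap list with one ip:port per line (assumed layout)."""
    hosts = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        host, _, port = line.rpartition(":")
        hosts.append((host, int(port)))
    return hosts
```

With this, a stock client that sends build_connect(GNUTELLA_NAME) is turned away by accepts(), which is what keeps the two networks from mixing; the host list plays the role of the private gnutella.net file mentioned above.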

Thanks
-greg

> -----Original Message-----
> From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> Behalf Of Salem Mark
> Sent: Thursday, December 01, 2005 1:28 PM
> To: p2p-hackers@zgp.org
> Subject: RE: [p2p-hackers] p2p framework
>
> >From: "Greg Bildson"
> >You could build off the limewire.org open source code (currently down) or
> >off of the gtk-gnutella or gnucleus source. You would want to form a
> >separate network since current vendors would consider alternate uses of the
> >existing network as pollution.
>
> Could you please elaborate on forming a separate network under Gnutella?
>
> I was thinking of using the Echomine-Muse gnutella API, which facilitates
> sending custom messages in Gnutella, as a technique for Jabber Servers to
> collaborate and achieve global service discovery.
>
> Thanks.
>
> - Salem
>
> >If you envisioned similar services as exist or extensions to the existing
> >services then this might make sense. If you want something as a basis for a
> >new clean protocol, I might not recommend it since some aspects of the
> >protocol are a little ugly underneath the covers. However, extension
> >mechanisms do exist for most message types.
> >
> >Thanks
> >-greg
> >
> > > -----Original Message-----
> > > From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On
> > > Behalf Of Davide Carboni
> > > Sent: Wednesday, November 30, 2005 1:17 PM
> > > To: Peer-to-peer development.
> > > Subject: Re: [p2p-hackers] p2p framework
> > >
> > > On 11/29/05, Greg Bildson wrote:
> > > > Do you mean Gnutella's use as a framework or otherwise?
> > >
> > > Yes I do. My question is: are there some implementation of gnutella
> > > that can be used to build upon new applications and to develop new
> > > services (beyond simple file sharing) ?
> > >
> > > D.

From rrrw at neofonie.de Thu Dec 1 20:48:45 2005
From: rrrw at neofonie.de (Ronald Wertlen)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <20051130223956.C99CC3FEE8@capsicum.zgp.org>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org>
Message-ID: <438F61AD.80406@neofonie.de>

Hi Adam,

perhaps you have not understood my message because you have not noticed the focus on "precision and recall" (i.e. search), not the old Distributed DB vs. own DB debate. You have also pigeon-holed my email with the DHT crowd (*grin*); it couldn't be further from it!

I was arguing in the other direction - which coderman thankfully picked up. Gnutella doesn't structure enough, that's all.
Sure, Gnutella beats DHTs on search - I base that observation on a project I finished last year - a public prototype that used JXTA and was honed for search using super-peers [DFN S2S http://s2s.neofonie.de/ (German site) - we've moved on some since then ;) ].

Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows practically anyone to elevate to super-peer, which results in a random (power-law distribution) network. Such a network is not going to perform very well as far as recall and precision are concerned, past a certain point. I would be interested to calculate that exact point (but doubting I'll get to it some time soon :-/).

HTH.

Best regards, Ron

PS. seems this thread has driven the original author to reformulate his statement... :-)

PPS. In fact, the network is not going to be completely random - it will follow the contours of the internet (distribution of servers, broadband connections, users, etc. is not random). I am not sure if that destroys or supports my argument. Back to the drawing board!

We actually need a better internet. [oops there I go getting unspecific again, sorry!! ;-) ]

> Message: 4
> Date: Wed, 30 Nov 2005 16:42:39 -0500
> From: Adam Fisk
> Subject: Re: [p2p-hackers] Re: scalability
> To: "Peer-to-peer development."
> Message-ID: <438E1CCF.4010907@speedymail.org>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> I don't understand your post. When you say "critical", I assume you're
> talking about life and death situations? Are you talking about anything
> specifically? DHTs have failure rates. Ad hoc and mesh networks can
> become useful in emergency situations where conventional infrastructures
> break down, but the centralized/p2p/structured/unstructured questions
> here are far from obvious.
>
> On the "obsessive science types" issue, this completely misses the
> point. It's a very non "obsessive science type" statement.
> There are strong reasons for using the massive indexing/random walk
> approach above DHTs -- reasons that have nothing to do with scalability.
> In particular, DHTs are, well, hash tables. Hash tables don't work well
> for metadata queries. They do fine for keywords (hotspots are a
> problem, but they can be solved), but they aren't as nice a fit for
> metadata. RDF and DHTs are tough to squeeze together, for example. The
> massive indexing (mutual index caching to use Serguei's term)/random
> walk approach can get around these issues more easily. They are also
> not nearly as brittle as DHTs. Sure, DHTs repair themselves after node
> joins and leaves, but node transience generally has a much greater
> effect on DHTs than it does on massive indexing networks.
>
> I also think you're underestimating the efficiency of massive indexing
> and random walks. Sure, these networks don't scale logarithmically, but
> they do pretty darn well.
>
> I encourage everyone to stay specific with their posts.
>
> All the Best,
>
> Adam

From agthorr at cs.uoregon.edu Thu Dec 1 20:52:16 2005
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F61AD.80406@neofonie.de>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de>
Message-ID: <20051201205215.GF5300@cs.uoregon.edu>

On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote:
> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows
> practically anyone to elevate to super-peer, which results in a random
> (power-law distribution) network.

Gnutella is not a power-law network. See my paper on the graph properties of Gnutella, presented at the Internet Measurement Conference earlier this year:

http://www.usenix.org/events/imc05/tech/stutzbach.html

> Such a network is not going to perform very well as far as recall
> and precision are concerned, past a certain point.
> I would be interested to calculate that exact point (but doubting I'll
> get to it some time soon :-/).

Could you rigorously define recall and precision for me? I'm not sure what you mean by these terms.

--
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                 University of Oregon

From afisk at speedymail.org Thu Dec 1 21:09:22 2005
From: afisk at speedymail.org (Adam Fisk)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F61AD.80406@neofonie.de>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de>
Message-ID: <438F6682.6070806@speedymail.org>

Hi Ron-

Apologies for the DHT pigeon-holing. I had this nagging feeling in my stomach that you may come more from the land of small world and power law networks, but I successfully suppressed it!

I agree with Daniel that Gnutella's not actually a power law network, although I can't remember what led me to decide that (several years ago now). If I recall correctly, it's that node degrees are quite fixed and uniform.

How would you prefer superpeers get elected? Superpeer election on Gnutella is fairly simple, primarily because there's a scarcity of non-firewalled/NATted machines to fill their roles, so you have to sort of take what you can get. Are you referring more to which superpeers to *select* over the course of a search and not the original choice of superpeers?

On the Gnutella 0.6/0.7 issue, that's really just the version of the specification for connection headers -- a frequent source of confusion. Gnutella has rightfully evolved into a family of protocols that themselves have version numbers -- everything from superpeers to dynamic querying to bloom filter exchange and mesh downloading. All of these evolve largely independently from one another, giving the protocol family much more flexibility and agility.

All the Best,

Adam

Ronald Wertlen wrote:
> Hi Adam,
>
> perhaps you have not understood my message because you have not
> noticed the focus on "precision and recall" (i.e. search) not the old
> Distributed DB vs. own DB debate. You have also pigeon-holed my email
> with the DHT crowd (*grin*), it couldn't be further from it!
>
> I was arguing in the other direction - which coderman thankfully
> picked up. Gnutella doesn't structure enough, that's all. Sure
> Gnutella beats DHTs on search - I base that observation on a project I
> finished last year - a public prototype that used JXTA and was honed
> for search using super-peers [DFN S2S http://s2s.neofonie.de/
> (German site) - we've moved on some since then ;) ].
>
> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows
> practically anyone to elevate to super-peer, which results in a random
> (power-law distribution) network. Such a network is not going to
> perform very well as far as recall and precision are concerned, past a
> certain point. I would be interested to calculate that exact point
> (but doubting I'll get to it some time soon :-/).
>
> HTH.
>
> Best regards, Ron
>
> PS. seems this thread has driven the original author to reformulate
> his statement... :-)
>
> PPS.
> In fact, the network is not going to be completely random - it will
> follow the contours of the internet (distribution of servers,
> broadband connections, users, etc. is not random). I am not sure if
> that destroys or supports my argument. Back to the drawing board!
>
> We actually need a better internet. [oops there I go getting
> unspecific again, sorry!! ;-) ]
>
>> Message: 4
>> Date: Wed, 30 Nov 2005 16:42:39 -0500
>> From: Adam Fisk
>> Subject: Re: [p2p-hackers] Re: scalability
>> To: "Peer-to-peer development."
>> Message-ID: <438E1CCF.4010907@speedymail.org>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> I don't understand your post. When you say "critical", I assume
>> you're talking about life and death situations? Are you talking
>> about anything specifically? DHTs have failure rates. Ad hoc and
>> mesh networks can become useful in emergency situations where
>> conventional infrastructures break down, but the
>> centralized/p2p/structured/unstructured questions here are far from
>> obvious.
>>
>> On the "obsessive science types" issue, this completely misses the
>> point. It's a very non "obsessive science type" statement. There
>> are strong reasons for using the massive indexing/random walk
>> approach above DHTs -- reasons that have nothing to do with
>> scalability. In particular, DHTs are, well, hash tables. Hash
>> tables don't work well for metadata queries. They do fine for
>> keywords (hotspots are a problem, but they can be solved), but they
>> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze
>> together, for example. The massive indexing (mutual index caching to
>> use Serguei's term)/random walk approach can get around these issues
>> more easily. They are also not nearly as brittle as DHTs. Sure,
>> DHTs repair themselves after node joins and leaves, but node
>> transience generally has a much greater effect on DHTs than it does
>> on massive indexing networks.
>>
>> I also think you're underestimating the efficiency of massive
>> indexing and random walks. Sure, these networks don't scale
>> logarithmically, but they do pretty darn well.
>> I encourage everyone to stay specific with their posts.
>>
>> All the Best,
>>
>> Adam

From srhea at cs.berkeley.edu Thu Dec 1 21:11:02 2005
From: srhea at cs.berkeley.edu (Sean Rhea)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
In-Reply-To:
References:
Message-ID: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu>

On Dec 1, 2005, at 1:38 PM, Salem Mark wrote:
> I have read in several papers that it is unlikely that the
> integrity of the DHT can be maintained where there is a high node
> or link failure rate without significant message transmission
> overhead. In other words, it is mentioned that, in "highly
> transient networks", where the number of nodes appearing and
> disappearing is very high, maintaining the DHT becomes hard and
> introduces considerable overhead.
>
> I am trying to find out what exactly "highly-transient" means. A
> file sharing network like Gnutella, seems to be highly transient,
> where peers join/leave the network frequently. Could somebody
> elaborate on this? is there a node departure/arrival/failure rate
> (per sec? per min?) that identifies "highly-transient" networks ?

In the Bamboo USENIX paper, we talked about the average time a node was connected to the network before disconnecting. Bamboo and Chord are definitely resilient (at a routing level) even when that period is as short as a few minutes:

http://srhea.net/papers/bamboo-usenix.pdf

Other DHTs may be this resilient as well, but I don't have data for them.

Sean
--
There is no end to the fragility of our democracy. -- Ralph Nader

From agthorr at cs.uoregon.edu Thu Dec 1 21:15:12 2005
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F6682.6070806@speedymail.org>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de> <438F6682.6070806@speedymail.org>
Message-ID: <20051201211511.GH5300@cs.uoregon.edu>

On Thu, Dec 01, 2005 at 04:09:22PM -0500, Adam Fisk wrote:
> On the Gnutella 0.6/0.7 issue, that's really just the version of the
> specification for connection headers -- a frequent source of confusion.
> Gnutella has rightfully evolved into a family of protocols that
> themselves have version numbers -- everything from superpeers to dynamic
> querying to bloom filter exchange and mesh downloading. All of these
> evolve largely independently from one another, giving the protocol
> family much more flexibility and agility.

I suggest adding text similar to this to the GDF Wiki main page, and changing "RFC-Gnutella 0.6" to "Gnutella Protocol Family" or the like. (That page apparently cannot be edited by normal wiki users.)

--
Daniel Stutzbach                           Computer Science Ph.D Student
http://www.barsoom.org/~agthorr                 University of Oregon

From m.rogers at cs.ucl.ac.uk Thu Dec 1 22:53:24 2005
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
In-Reply-To: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu>
References: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu>
Message-ID: <438F7EE4.9030209@cs.ucl.ac.uk>

Sean Rhea wrote:
> In the Bamboo USENIX paper, we talked about the average time a node was
> connected to the network before disconnecting.
> Bamboo and Chord are definitely resilient (at a routing level) even
> when that period is as short as a few minutes:

To what extent does this depend on the distribution of session times as well as the mean? Kademlia assumes that old nodes will outlive new nodes, and Daniel's paper shows that Gnutella contains an emergent core of long-lived nodes - how well do Bamboo and Chord survive under non-uniform churn?

Cheers,
Michael

From srhea at cs.berkeley.edu Thu Dec 1 23:01:51 2005
From: srhea at cs.berkeley.edu (Sean Rhea)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
In-Reply-To: <438F7EE4.9030209@cs.ucl.ac.uk>
References: <6F218011-BE6D-4C78-A86E-EFF47546C18D@cs.berkeley.edu> <438F7EE4.9030209@cs.ucl.ac.uk>
Message-ID:

On Dec 1, 2005, at 5:53 PM, Michael Rogers wrote:
> To what extent does this depend on the distribution of session
> times as well as the mean? Kademlia assumes that old nodes will
> outlive new nodes, and Daniel's paper shows that Gnutella contains
> an emergent core of long-lived nodes - how well do Bamboo and Chord
> survive under non-uniform churn?

We used exponentially-distributed node lifetimes, so old nodes do not generally outlive new ones. However, I _think_ that choice only makes the problem harder. In particular, I would suspect that Bamboo/Chord would do just as well if old nodes lived longer than new ones, and possibly better. They won't take advantage of it like Kademlia does, but it shouldn't hurt them either. (At least that's my guess; I don't have data to prove it.)

Sean
--
When I see the price that you pay / I don't wanna grow up -- Tom Waits

From john.casey at gmail.com Fri Dec 2 00:07:56 2005
From: john.casey at gmail.com (John Casey)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability (was: p2p framework)
In-Reply-To: <438E067B.2040408@neofonie.de>
References: <20051130095529.6CAE83FEB8@capsicum.zgp.org> <438E067B.2040408@neofonie.de>
Message-ID:

On 12/1/05, Ronald Wertlen wrote:
> Hi,
>
> Gnutella-bashing certainly may be fun, the truth is, it is tremendously
> well-adapted for its purpose (I think Serguei's said the relevant stuff).
>
> However, I also believe it is pretty clear that from a search point of
> view, a random super-peer based network does not scale - it is never
> going to get the kind of precision and recall that we would call
> intelligent. It would be too slow or too inaccurate.

But if you index everything in some sort of distributed inverted index on top of a DHT, a lot of document postings and related metadata still have to be exported to the network, which isn't such a great solution either. The worst thing is that semantically close terms and documents are going to be scattered to random remote locations in the network for indexing.

Personally, what I think is needed here is a slightly coarser indexing structure. So instead of publishing 1000s of term->document pointers or, at the other extreme, a few term->peer pointers as with PlanetP, there is some sort of middle ground such as term->cluster-id, which is better able to direct a search to sensible peers. The difficulty with this approach, of course, is that it isn't easy to construct sensible global clusters from local cluster definitions, as different local document databases will index different terms and the like.
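
To make the middle-ground idea above a little more concrete, here is a minimal in-memory sketch of a term->cluster-id index. It is only an illustration under stated assumptions: the class name CoarseIndex, its methods, and the notion that each peer already knows which cluster it belongs to are all hypothetical, and the hard part named above (agreeing on global clusters from local definitions) is deliberately left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class CoarseIndex:
    """Hypothetical coarse index: terms point at clusters, not documents."""

    def __init__(self) -> None:
        self.term_to_clusters: Dict[str, Set[str]] = defaultdict(set)
        self.cluster_to_peers: Dict[str, Set[str]] = defaultdict(set)

    def publish(self, peer: str, cluster_id: str, terms: Iterable[str]) -> None:
        """A peer advertises its cluster and the terms it can answer for."""
        self.cluster_to_peers[cluster_id].add(peer)
        for term in terms:
            self.term_to_clusters[term.lower()].add(cluster_id)

    def route(self, query_terms: Iterable[str]) -> Set[str]:
        """Return peers in clusters that advertise every query term."""
        cluster_sets = [self.term_to_clusters.get(t.lower(), set())
                        for t in query_terms]
        if not cluster_sets:
            return set()
        clusters = set.intersection(*cluster_sets)
        peers: Set[str] = set()
        for cluster_id in clusters:
            peers |= self.cluster_to_peers[cluster_id]
        return peers


index = CoarseIndex()
index.publish("peerA", "music", ["jazz", "vinyl"])
index.publish("peerB", "music", ["jazz"])
index.publish("peerC", "code", ["python", "jazz"])
candidates = index.route(["jazz", "vinyl"])  # peers worth asking, not final hits
```

Note the trade-off the sketch exposes: peerB is returned for "jazz vinyl" even though it never published "vinyl", because routing happens at cluster granularity. The index stays small, and the selected peers still have to run the actual query locally.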

From baoguai2000 at gmail.com Fri Dec 2 03:06:21 2005
From: baoguai2000 at gmail.com (zheng j)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] help:is anyone interested in living streaming or vod over p2p?
Message-ID:

Hi, I am now doing research on living streaming and Vod over p2p, but
I don't know who can I discuss my idea with, you know, without idea
exchange, I feel very confused and annoyed. Who can tell me which
website I can find someone interested in it? And, if you are
interested in it, please contact me。

From joaquin.keller at francetelecom.com Fri Dec 2 04:05:36 2005
From: joaquin.keller at francetelecom.com (KELLER Joaquin RD-MAPS-ISS)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] help:is anyone interested in living streaming or vodover p2p?
In-Reply-To:
References:
Message-ID:

Hi Zheng,

We are working on that live streaming (not on VoD)
http://pulse.netofpeers.net/

-- Joaquin

On 12/1/05, zheng j wrote:
>
> Hi, I am now doing research on living streaming and Vod over p2p, but
> I don't know who can I discuss my idea with, you know, without idea
> exchange, I feel very confused and annoyed. Who can tell me which
> website I can find someone interested in it? And, if you are
> interested in it, please contact me。
>

--
___________________________________________________________
Joaquin Keller
MAPS/MMC - France Telecom - Division R&D
38-40, rue du General Leclerc
92794 Issy Moulineaux Cedex 9
Tel: +33 (0)1 45 29 52 86
Fax: +33 (0)1 45 29 52 94
joaquin.keller@rd.francetelecom.com
http://solipsis.netofpeers.net/

From redist-p2p-hackers at lothar.com Fri Dec 2 07:54:31 2005
From: redist-p2p-hackers at lothar.com (Brian Warner)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: P2P in SFC
Message-ID: <20051201.235431.65614928.warner@lothar.com>

> Regardless, let's wait for the final guest list before deciding if we
> switch locales.
I'll be there too. Thanks for setting this up!

-Brian

From aloeser at cs.tu-berlin.de Fri Dec 2 09:28:49 2005
From: aloeser at cs.tu-berlin.de (Alexander Löser)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <438F61AD.80406@neofonie.de>
References: <20051130223956.C99CC3FEE8@capsicum.zgp.org> <438F61AD.80406@neofonie.de>
Message-ID: <439013D1.7020803@cs.tu-berlin.de>

Hi Adam,

originally there was a certain type of clustering in the beginnings of Gnutella (late 90s). People communicated their IDs by word of mouth, via email, or via Deja News to other people. So in most cases you got IDs from people who had at least similar interests, or from people from whom you expected some interesting files. Later, due to the overwhelming attractiveness of the Gnutella application, they introduced the gtk and other bootstrapping alternatives, giving you a number of starting pointers. However, these starting points are chosen 'randomly', so there is no longer any clustering by interests.

We (Berlin and Karlsruhe) developed a new protocol (INGA, Interest-based Node Grouping Algorithm [1][2]) that reclusters the network based on the interests of the peers, without any DHT, using only an unstructured network. Similar to Freenet, the network topology evolves over time into a so-called small-world topology, where people with similar interests are clustered together. In addition, to further speed up the clustering process, peers also keep in a local index structure other peers that are 'hubs' in the network, e.g. having a high in- and out-degree. Our experiments show that we significantly outperform Gnutella-style approaches in number of messages, even in highly volatile networks.

Best,
Alex

[1] Searching Dynamic Communities with Personal Indexes. Löser, Tempich et al. 3rd International Semantic Web Conference, Galway. Springer 2005
    http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf
[2] Remindin': Semantic query routing in peer-to-peer networks based on social metaphors. Tempich et al. WWW 2004, New York. ACM 2004
    http://www.aifb.uni-karlsruhe.de/Publikationen/showPublikation?publ_id=447

Ronald Wertlen wrote:

> Hi Adam,
>
> perhaps you have not understood my message because you have not
> noticed the focus on "precision and recall" (i.e. search) not the old
> Distributed DB vs. own DB debate. You have also pigeon-holed my email
> with the DHT crowd (*grin*), it couldn't be further from it!
>
> I was arguing in the other direction - which coderman thankfully
> picked up. Gnutella doesn't structure enough, that's all. Sure
> Gnutella beats DHTs on search - I base that observation on a project I
> finished last year - a public prototype that used JXTA and was honed
> for search using super-peers [DFN S2S http://s2s.neofonie.de/
> (German site) - we've moved on some since then ;) ].
>
> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows
> practically anyone to elevate to super-peer, which results in a random
> (power-law distribution) network. Such a network is not going to
> perform very well as far as recall and precision are concerned, past a
> certain point. I would be interested to calculate that exact point
> (but I doubt I'll get to it any time soon :-/).
>
> HTH.
>
> Best regards, Ron
>
> PS. seems this thread has driven the original author to reformulate
> his statement... :-)
>
> PPS.
> In fact, the network is not going to be completely random - it will
> follow the contours of the internet (distribution of servers,
> broadband connections, users, etc. is not random). I am not sure if
> that destroys or supports my argument. Back to the drawing board!
>
> We actually need a better internet. [oops there I go getting
> unspecific again, sorry!! ;-) ]
>
>
>> Message: 4
>> Date: Wed, 30 Nov 2005 16:42:39 -0500
>> From: Adam Fisk
>> Subject: Re: [p2p-hackers] Re: scalability
>> To: "Peer-to-peer development."
>> Message-ID: <438E1CCF.4010907@speedymail.org>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> I don't understand your post. When you say "critical", I assume
>> you're talking about life and death situations? Are you talking
>> about anything specifically? DHTs have failure rates. Ad hoc and
>> mesh networks can become useful in emergency situations where
>> conventional infrastructures break down, but the
>> centralized/p2p/structured/unstructured questions here are far from
>> obvious.
>>
>> On the "obsessive science types" issue, this completely misses the
>> point. It's a very non "obsessive science type" statement. There
>> are strong reasons for using the massive indexing/random walk
>> approach above DHTs -- reasons that have nothing to do with
>> scalability. In particular, DHTs are, well, hash tables. Hash
>> tables don't work well for metadata queries. They do fine for
>> keywords (hotspots are a problem, but they can be solved), but they
>> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze
>> together, for example. The massive indexing (mutual index caching to
>> use Serguei's term)/random walk approach can get around these issues
>> more easily. They are also not nearly as brittle as DHTs. Sure,
>> DHTs repair themselves after node joins and leaves, but node
>> transience generally has a much greater effect on DHTs than it does
>> on massive indexing networks.
>>
>> I also think you're underestimating the efficiency of massive
>> indexing and random walks. Sure, these networks don't scale
>> logarithmically, but they do pretty darn well.
>> I encourage everyone to stay specific with their posts.
>>
>> All the Best,
>>
>> Adam
>
> _______________________________________________
> p2p-hackers mailing list
> p2p-hackers@zgp.org
> http://zgp.org/mailman/listinfo/p2p-hackers
> _______________________________________________
> Here is a web page listing P2P Conferences:
> http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences
>

-- 
___________________________________________________________
Dr. Alexander Löser, Technische Universität Berlin,
CIS, Sekr. EN 7, Einsteinufer 17, 10587 Berlin, GERMANY
office: +49- 30-314-25556 fax: +49- 30-314-21601
web: http://cis.cs.tu-berlin.de/~aloeser/
___________________________________________________________

From gwendal.simon at francetelecom.com Fri Dec 2 09:38:14 2005
From: gwendal.simon at francetelecom.com (SIMON Gwendal RD-MAPS-ISS)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
Message-ID:

Hi Alexander,

This work is close to what we are doing for Maay [1]. As we are just beginning to implement it, it would be great if you could take part in the early protocol discussion on the mailing list.

The current Maay implementation [2] is very open. We are developing a basic indexer that communicates through XML-RPC with the "Maay node". The "Maay node" manages communication and the SQL database. It can be controlled through a web interface.

Have fun!

-- Gwendal

[1]: MAAY: a decentralized personalized search system, F. Dang Ngoc, J. Keller, G. Simon. SAINT'2006
     http://maay.netofpeers.net/documentation/maay_SAINT2006.pdf
[2]: http://maay.netofpeers.net

From ian.clarke at gmail.com Fri Dec 2 12:07:32 2005
From: ian.clarke at gmail.com (Ian Clarke)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] darknet ~= (blacknet, f2f net)
In-Reply-To: <20051129140314.046DD698@yumyum.zooko.com>
References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com>
Message-ID: <823242bd0512020407i252b84c4u@mail.gmail.com>

On 29/11/05, zooko@zooko.com wrote:
> However, the media seems to have started using the word "Darknet" to mean a
> friend-to-friend net and/or a blacknet [7, 8], thus simultaneously making it
> harder for people to think about blacknets which are based on other than
> friend-to-friend architectures and making it harder for people to think about
> friend-to-friend networks which are used for other than illegal information
> sharing.
>
> I place some of the blame for this development on the Freenet folks, who may be
> the first to promulgate this munging, and if they aren't the first they're
> certainly the most effective.

As Michael Rogers pointed out, I am not sure this is as clear-cut as you suggest: the goal for Freenet 0.7 is very close to the idea outlined in the caption for Fig. 3 of the Microsoft Darknet paper, which is a friend-to-friend network.

That paper may be the first common usage of the term "darknet", but so far as I can see, it contains no concise definition of what a "darknet" is.
I would therefore say that there is no authoritative basis on which to invalidate any particular definition of the term that is broadly within the area of P2P networks which conceal user activity.

As such, defining the term "darknet" as a f2f network that is designed to conceal the activities of its participants (this being, so far as I have seen, one of the main motivations for building an f2f network), is as valid a definition as any other I have seen (and more useful than most).

As a side-point, I think it is somewhat pejorative to say that any technology is "designed" for illegal usage, just because it conceals user activity and therefore may be capable of illegal usage. There are many legal reasons why people might wish to preserve their anonymity and privacy.

Ian.

From adam at cypherspace.org Fri Dec 2 13:35:16 2005
From: adam at cypherspace.org (Adam Back)
Date: Sat Dec 9 22:13:05 2006
Subject: idealized content network properties (Re: [p2p-hackers] darknet)
In-Reply-To: <823242bd0512020407i252b84c4u@mail.gmail.com>
References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com>
Message-ID: <20051202133516.GA15480@bitchcake.off.net>

I think an ideal www2 network should:

1. have any content searchable by anyone (the contents are public)
2. make it hard to determine who the author of content is
3. make it hard for people other than the author to remove content
4. make it hard for people to observe what other people are downloading
5. make it hard for anyone to change content (new version and navigating by version should be the way to "change")

It seems to me that this network can provide any of these subset classifications trivially.

removing 1 makes e.g. a "friend-to-friend" network -- that just means you encrypt the searchable tags and content with a shared key.

removing 2 you just sign the content.

and so forth.
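
A minimal sketch of the "removing 1" case, under loud assumptions: instead of real encryption it blinds the searchable tags with an HMAC keyed by the shared key, so that only key holders can compute the value to look up. The names `blind_tag` and `BlindIndex` are hypothetical, and encrypting the content itself or signing it ("removing 2") would need a cryptography library and is not shown.

```python
import hashlib
import hmac


def blind_tag(shared_key: bytes, tag: str) -> str:
    """Keyed digest of a search tag; recomputable only with the shared key."""
    normalized = tag.strip().lower().encode("utf-8")
    return hmac.new(shared_key, normalized, hashlib.sha256).hexdigest()


class BlindIndex:
    """A publicly readable index that only ever stores blinded tags."""

    def __init__(self):
        self._entries = {}  # blinded tag -> set of content ids

    def publish(self, shared_key, tags, content_id):
        for tag in tags:
            self._entries.setdefault(blind_tag(shared_key, tag), set()).add(content_id)

    def search(self, shared_key, tag):
        return set(self._entries.get(blind_tag(shared_key, tag), ()))


index = BlindIndex()
friends_key = b"key shared among friends (hypothetical)"
index.publish(friends_key, ["Live Streaming", "p2p"], "doc-1")

# A key holder finds the entry; a searcher with a different key does not.
found_by_friend = index.search(friends_key, "live streaming")
found_by_outsider = index.search(b"some other key", "live streaming")
```

The index itself stays public, which matches the idea that restricting searchability is a layer on top of the base network rather than a different network.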
(Making it hard for people other than the author to remove content technically probably involves things like redundancy, transience of service, opaque content to its current server location, indirection etc) (The author also should be able to arrange that he himself can't remove the content, by intentionally discarding whatever keys give him the technical means to remove or change the content). > As a side-point, I think it is somewhat pejorative to say that any > technology is "designed" for illegal usage, just because it conceals > user activity and therefore may be capable of illegal usage. There > are many legal reasons why people might wish to preserve their > anonymity and privacy. Yeah. I think my feature set at the top should be the default/base set of properties exhibited by the www2 (next gen web). Any voluntary restrictions on these should be entered into by policy. Say content X is illegal in jurisdiction Y, then Y should publish a blacklist identifying content X and the legal system in jurisdiction Y should if it chooses make it illegal to not consult the blacklist. I mean illegality is not even consistent, there are things which are legally required in Y that are illegal in Z. There is and can be no globally acceptable policy, so we must robustly technologically prevent global enforcement. Adam From m.rogers at cs.ucl.ac.uk Fri Dec 2 14:19:29 2005 From: m.rogers at cs.ucl.ac.uk (Michael Rogers) Date: Sat Dec 9 22:13:05 2006 Subject: idealized content network properties (Re: [p2p-hackers] darknet) In-Reply-To: <20051202133516.GA15480@bitchcake.off.net> References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202133516.GA15480@bitchcake.off.net> Message-ID: <439057F1.2060208@cs.ucl.ac.uk> Adam Back wrote: > removing 1 makes a eg "friend-to-friend" network -- that just means > you encrypt the searchable tags and content with a shared key. 
Not sure about this one - I think the use of group keys is orthogonal to the use of a friend-to-friend topology. For example Groove uses group keys without f2f, Freenet 0.7 will use f2f without group keys, and WASTE uses neither (but still fits under the "darknet" umbrella because it's invitation-only).

Cheers,
Michael

From zooko at zooko.com Fri Dec 2 15:45:57 2005
From: zooko at zooko.com (zooko@zooko.com)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] darknet ~= (blacknet, f2f net)
In-Reply-To: <823242bd0512020407i252b84c4u@mail.gmail.com>
References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com>
Message-ID: <20051202154557.E559F191C@yumyum.zooko.com>

Ian, p2p-hackers:

It's not my goal to quibble about etymology (except inasmuch as it is useful to preserve the historical record). My goals are:

1. Avoid ambiguity -- where some people think that word X denotes concept 1, and others think that word X denotes concept 2. Especially if concepts 1 and 2 are related but not identical. Especially if one of them is politically incendiary.

2. Make sure we have names for our useful concepts.

However, before I get to that I am going to go through the history one last time in order to cast light on the current problem. I turned up some interesting details.

Let's start with a Venn diagram:

      _______      _______
     /       \    /       \
    /         \  /         \
   /           \/           \
  /            /\            \
 /            /  \            \
|            |    |            |
|     1      |1^2 |     2      |
|            |    |            |
|            |    |            |
 \            \  /            /
  \            \/            /
   \           /\           /
    \         /  \         /
     \_______/    \_______/

Let 1 be the set of networks which are used for illegal transmission of information, and 2 be the set of networks which are built on f2f connections, and 1^2 be the intersection -- the set of networks which are used for illegal transmission of information and which are built on f2f connections.

[bepw2002] introduces "darknet" to mean concept 1.
In their words darknet is "a collection of networks and technologies used to share digital content", and they use it consistently within that meaning. They refer to concept 2, starting in section 2.1, using the term "small-world nets", and they clearly distinguish between what they call "small-world darknets" and "non-small-world darknets". However nowadays some people in the mass media seem to think that a "darknet" means primarily a network which is "invitation-only", i.e. a "small-world" or "f2f" net [globe]. When did the meaning shift? Ooh -- how interesting to examine the evolution of this word on [wikipedia]! The original definition on wikipedia was written on 2004-09-30. It read in full: "Darknet is a broad term to denote the networks and technologies that enable users to copy and share digital material. The term was coined in a paper from four Microsoft Research authors.". The next change was that two months later someone redirected the "Darknet" page to just be a link to the "Filesharing page", with the comment "Just another word for filesharing". The next change was that on 2005-04-14 someone from IP 81.178.83.245 wrote a definition beginning with this sentence: "A Darknet is a private file sharing network where users only connect to people they trust.". By the way, I should point out that I have a personal interest in this history because between 2001 and 2003 I tried to promulgate concept 2, using Lucas Gonze's coinage: "friendnet" [zooko2001, zooko2002, zooko2003, gonze2002]. I would like to know for my own satisfaction if my ideas were a direct inspiration for some of this modern stuff, such as the Freenet v0.7 design. So much for etymology. Now the problem is that in the current parlance of the media, the word "darknet" is used to mean vaguely 1 or 2 or 1^2. The reason that this is a problem isn't that it breaks with some etymological tradition, but that it is ambiguous and that it deprives us of useful words to refer to 1 or 2 specifically. 
The ambiguity has nasty political consequences -- see for example these f2f network operators struggling to persuade newspaper readers that they are not primarily for illegal purposes: [globe].

My proposal to rectify the lack-of-words problem is to use "blacknet" to refer to 1 specifically and "f2f net" to refer to 2 specifically. I don't know if there is any way to rectify the ambiguity problem.

Ian wrote:
>
> ...
> defining the term "darknet" as a f2f network that is designed
> to conceal the activities of its participants (this being, so far as I
> have seen, one of the main motivations for building an f2f network),

So you think of "darknet" as meaning 1^2. That's an interesting remark -- that you regard concealment as one of the main motivations. I personally regard concealment as one of the lesser motivations -- I'm more interested in attack resistance (resisting attacks such as subversion or denial-of-service, rather than attacks such as surveillance), scalability, and other properties. Although I'm interested in the concealment properties as well.

Regards,

Zooko

P.S. Here's some obligatory link juice for Gonze's latest sly neologism: lightnet!

[bepw2002] "The darknet and the future of content distribution"
    Biddle, England, Peinado, Willman (Microsoft Corporation)
    http://crypto.stanford.edu/DRM2002/darknet5.doc
    http://www.dklevine.com/archive/darknet.pdf
    (The .doc version crashes my OpenOffice.org app when I try to read it. Does this mean something? The .pdf version has screwed up images when I view it in evince.)
[wikipedia] http://en.wikipedia.org/wiki/Darknet
[zooko2001] "Attack Resistant Sharing of Metadata"
    Zooko and Raph Levien presentation, First O'Reilly Peer-to-Peer conference, 2001
    http://conferences.oreillynet.com/cs/p2p2001/view/e_sess/1200
[zooko2002] http://zooko.com/log-2002-12.html#d2002-12-14-the_human_context_and_the_future_of_Mnet
[zooko2003] http://www.zooko.com/log-2003-01.html#d2003-01-23-trust_is_just_another_topology
[gonze2002] http://www.oreillynet.com/pub/wlg/2428
[globe] "Darknets: The invitation-only Internet" globeandmail.com 2005-11-24
    http://www.globetechnology.com/servlet/story/RTGAM.20051007.gtdarknetoct7/BNStory/Technology/
[lightnet] http://gonze.com/weblog/story/lightnet

From m.rogers at cs.ucl.ac.uk Fri Dec 2 16:02:07 2005
From: m.rogers at cs.ucl.ac.uk (Michael Rogers)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] darknet ~= (blacknet, f2f net)
In-Reply-To: <20051202154557.E559F191C@yumyum.zooko.com>
References: <200511291414.35852.01771@iha.dk> <20051129140314.046DD698@yumyum.zooko.com> <823242bd0512020407i252b84c4u@mail.gmail.com> <20051202154557.E559F191C@yumyum.zooko.com>
Message-ID: <43906FFF.3070302@cs.ucl.ac.uk>

zooko@zooko.com wrote:
> However nowadays some people in the mass media seem to think that a "darknet"
> means primarily a network which is "invitation-only", i.e. a "small-world" or
> "f2f" net [globe].

Sorry to split an already frayed hair, but invitation-only isn't the same as f2f. Invitation-only implies that you must know some member of the network, whereas f2f implies that you must know the members you connect to. For example Groove and WASTE are invitation-only but not f2f.
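
The distinction can be written down as two predicates over a "who knows whom" relation. This is only an illustrative sketch: the function names, the toy `knows` relation and the peer names are all made up, and a lone founder (who knows nobody yet) is not modelled.

```python
def is_invitation_only(members, knows):
    """Every member knows at least one other member (someone could invite them)."""
    return all(
        any(other in knows.get(m, set()) for other in members if other != m)
        for m in members
    )


def is_f2f(connections, knows):
    """Every overlay connection joins two peers who know each other."""
    return all(
        b in knows.get(a, set()) and a in knows.get(b, set())
        for a, b in connections
    )


# Hypothetical acquaintance relation: alice and carol are strangers.
knows = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob"},
}
members = {"alice", "bob", "carol"}

f2f_overlay = [("alice", "bob"), ("bob", "carol")]
open_overlay = f2f_overlay + [("alice", "carol")]  # connects two strangers

both_invitation_only = is_invitation_only(members, knows)
first_is_f2f = is_f2f(f2f_overlay, knows)
second_is_f2f = is_f2f(open_overlay, knows)
```

Here both overlays have the same invitation-only membership, but only the first restricts connections to acquaintances, which is the Groove/WASTE situation described above.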
Cheers,
Michael

From mccoy at mad-scientist.com Fri Dec 2 17:32:53 2005
From: mccoy at mad-scientist.com (Jim McCoy)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: P2P in SFC
In-Reply-To: <20051201.235431.65614928.warner@lothar.com>
References: <20051201.235431.65614928.warner@lothar.com>
Message-ID: <56D8091C-1D45-48C2-975C-5F6A1D47059B@mad-scientist.com>

> Regardless, let's wait for the final guest list before deciding if we
> switch locales.

I will be there.

Jim

From zooko at zooko.com Fri Dec 2 17:20:47 2005
From: zooko at zooko.com (zooko@zooko.com)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] darknet.pdf
Message-ID: <20051202172047.64212339@yumyum.zooko.com>

Thanks to anonymous contributor.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pdf
Size: 246474 bytes
Desc: not available
Url : http://zgp.org/pipermail/p2p-hackers/attachments/20051202/62b9c4af/attachment.pdf

From coderman at gmail.com Fri Dec 2 18:08:32 2005
From: coderman at gmail.com (coderman)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] help:is anyone interested in living streaming or vod over p2p?
In-Reply-To:
References:
Message-ID: <4ef5fec60512021008v64987949xb6880691dd2fceec@mail.gmail.com>

On 12/1/05, zheng j wrote:
> Hi, I am now doing research on living streaming and Vod over p2p...

wireless is a natural fit for p2p streaming / broadcast distribution

From Serguei.Osokine at efi.com Fri Dec 2 18:23:24 2005
From: Serguei.Osokine at efi.com (Serguei Osokine)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42751@fcexmb04.efi.internal>

On Friday, December 02, 2005 Alexander Löser wrote:
> originally there was a certain type of clustering in the beginnings
> of Gnutella (late 90ies) . People communicate its ids mouth to mouth
> or via Email or deja news to other people.
So in most cases you got > Ids from people which had at least similar interests, or from > people where you expected some interesting files. I'm sorry to contradict you, but I think this is all a myth. First, there was no Gnutella in late 90ies. It was released in March of 2000. Second, I remember looking at the connection stability just a few months later (June/July, maybe?), and the churn was quite high - a client tended to replace all its connections within an hour or so. Now if you remember how the connections were replaced, the client was trying the IPs that it received from PONGs, which were essentially the random network IPs, because the network was just a few thousand nodes and every client could see the pongs from pretty much everyone. So in an hour or so your initial connection point stopped being relevant and you found yourself at a random place in the network. After that, all your subsequent sessions used the IP list stored on disk by a previous session to connect to the network, and the address given to you by your friends was no longer important. To be precise, this latest part (about the IP list) was the behaviour of the Gnutella clients that I worked with (I think these were Gnutella v.056 and GNUT). Maybe there were some clients that required to enter an IP at every session start. I don't know. There was also a notion of locality based on the unusually good and stable connections - as soon as the two machines on my desktop would find each other on the network as a result of this random process, they would stay connected for quite a while (as long as I did not stop the clients). But even these considerations are not important, because the early Gnutella (until the meltdown of July 2000) was fully visible, and every query more or less reached every node (in the absence of the flow control, this is exactly what caused the meltdown - TTL was too high to limit the query propagation). 
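
To make the TTL point concrete, here is a toy model of a flooded query. It is a sketch under stated assumptions, not Gnutella's actual wire behaviour: the overlay is a hypothetical ten-peer ring, `ring` and `flood` are made-up helper names, and the only rules modelled are "forward to every neighbour except the sender" and "drop duplicates".

```python
from collections import deque


def ring(n):
    """Toy overlay: n peers in a ring, each connected to two neighbours."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def flood(graph, origin, ttl):
    """TTL-limited flood; returns (peers that saw the query, messages sent)."""
    seen = {origin}
    messages = 0
    queue = deque([(origin, None, ttl)])
    while queue:
        node, came_from, remaining = queue.popleft()
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbour in graph[node]:
            if neighbour == came_from:
                continue  # never send the query back where it came from
            messages += 1
            if neighbour not in seen:  # duplicates cost a message but are dropped
                seen.add(neighbour)
                queue.append((neighbour, node, remaining - 1))
    return seen, messages


overlay = ring(10)
reached_low, msgs_low = flood(overlay, origin=0, ttl=2)    # query stays local
reached_high, msgs_high = flood(overlay, origin=0, ttl=7)  # query covers the ring
```

Once the TTL exceeds the distance to the farthest peer, every query reaches every node, so no peer can be "closer" to a topic than any other; that is the sense in which a fully visible network leaves no room for clustering.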
Of course, some queries might have been missing some nodes, but generally there was no chance for any clustering - I simply cannot see how it could possibly exist in such a network. > We (Berlin and Karlsruhe) developed a new protocol (INGA Interest > based Node Grouping Algorithm [1][2]) , that reclusters the network > based on the interests of the peers, without any DHT, only using on > an unstructured network. Which is cool, and maybe it is a great protocol - as long as you won't justify its existence by myths. I'm sure there are plenty of legitimate reasons that make this protocol useful ;-) Best wishes - S.Osokine. 2 Dec 2005. -----Original Message----- From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org]On Behalf Of Alexander L?ser Sent: Friday, December 02, 2005 1:29 AM To: Peer-to-peer development. Subject: Re: [p2p-hackers] Re: scalability Hi Adam, originally there was a certain type of clustering in the beginnings of Gnutella (late 90ies) . People communicate its ids mouth to mouth or via Email or deja news to other people. So in most cases you got Ids from people which had at least similar interests, or from people where you expected some interesting files. Later, due to the overwhelming attractiveness of the gnutella application they introduced the gtk and other bootstrapping alternatives, given you a number of starting pointers. However, this starting points a chosen 'randomly', so there is no longer any clustering by interests. We (Berlin and Karlsruhe) developed a new protocol (INGA Interest based Node Grouping Algorithm [1][2]) , that reclusters the network based on the interests of the peers, without any DHT, only using on an unstructured network. Similar to freenet, the network topology evolves over a while to a so called small world topology, where people with similar interests are clustered together. 
In addition, to further speed up the clustering process, peers also keep in a local index structures other peers, that are 'HUBs' in the network, e.g. having a high in and out degree. Our experiments show, that we significantly outperform Gnutella style approaches in messages even in highly volatile networks. Best's Alex [1] Searching Dynamic Communities with Personal Indexes. L?ser, Tempich et.al 3rd. International Semantic Web Conference, Galway. Springer 2005 http://cis.cs.tu-berlin.de/~aloeser/publications/iswc2005.pdf [2] Remindin': Semantic query routing in peer-to-peer networks based on social metaphors. Tempich et.al. WWW 2004, New York. ACM 2004 http://**www.aifb.uni-karlsruhe.de/ Publikationen/showPublikation?publ_id=447 Ronald Wertlen schrieb: > Hi Adam, > > perhaps you have not understood my message because you have not > noticed the focus on "precision and recall" (i.e. search) not the old > Distributed DB vs. own DB debate. You have also pigeon-holed my email > with the DHT crowd (*grin*), it couldn't be further from it! > > I was arguing in the other direction - which coderman thankfully > picked up. Gnutella doesn't structure enough, that's all. Sure > Gnutella beats DHTs on search - I base that observation on a project I > finished last year - a public prototype that used JXTA and was honed > for search using super-peers [DFN S2S http://s2s.neofonie.de/ > (German site) - we've moved on some since them ;) ]. > > Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows > practically anyone to elevate to super-peer, which results in a random > (power-law distribtion) network. Such a network is not going to > perform very well as far as recall and precision are concerned, past a > certain point. I would be interested to calculate that exact point > (but doubting I'll get to it some time soon :-/). > > HTH. > > Best regards, Ron > > PS. seems this thread has driven the original author to reformulate > his statment... :-) > > PPS. 
> In fact, the network is not going to be completely random - it will
> follow the contours of the internet (distribution of servers,
> broadband connections, users, etc. is not random). I am not sure if
> that destroys or supports my argument. Back to the drawing board!
>
> We actually need a better internet. [oops there I go getting
> unspecific again, sorry!! ;-) ]
>
>
>> Message: 4
>> Date: Wed, 30 Nov 2005 16:42:39 -0500
>> From: Adam Fisk
>> Subject: Re: [p2p-hackers] Re: scalability
>> To: "Peer-to-peer development."
>> Message-ID: <438E1CCF.4010907@speedymail.org>
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>
>> I don't understand your post. When you say "critical", I assume
>> you're talking about life and death situations? Are you talking
>> about anything specifically? DHTs have failure rates. Ad hoc and
>> mesh networks can become useful in emergency situations where
>> conventional infrastructures break down, but the
>> centralized/p2p/structured/unstructured questions here are far from
>> obvious.
>>
>> On the "obsessive science types" issue, this completely misses the
>> point. It's a very non "obsessive science type" statement. There
>> are strong reasons for using the massive indexing/random walk
>> approach above DHTs -- reasons that have nothing to do with
>> scalability. In particular, DHTs are, well, hash tables. Hash
>> tables don't work well for metadata queries. They do fine for
>> keywords (hotspots are a problem, but they can be solved), but they
>> aren't as nice a fit for metadata. RDF and DHTs are tough to squeeze
>> together, for example. The massive indexing (mutual index caching to
>> use Serguei's term)/random walk approach can get around these issues
>> more easily. They are also not nearly as brittle as DHTs. Sure,
>> DHTs repair themselves after node joins and leaves, but node
>> transience generally has a much greater effect on DHTs than it does
>> on massive indexing networks.
>>
>> I also think you're underestimating the efficiency of massive
>> indexing and random walks. Sure, these networks don't scale
>> logarithmically, but they do pretty darn well.
>> I encourage everyone to stay specific with their posts.
>>
>> All the Best,
>>
>> Adam

-- 
___________________________________________________________

Dr. Alexander Löser,
Technische Universität Berlin,
CIS, Sekr. EN 7, Einsteinufer 17, 10587 Berlin, GERMANY
office: +49- 30-314-25556 fax: +49- 30-314-21601
web: http://cis.cs.tu-berlin.de/~aloeser/
___________________________________________________________

_______________________________________________
p2p-hackers mailing list
p2p-hackers@zgp.org
http://zgp.org/mailman/listinfo/p2p-hackers
_______________________________________________
Here is a web page listing P2P Conferences:
http://www.neurogrid.net/twiki/bin/view/Main/PeerToPeerConferences

From coderman at gmail.com Fri Dec 2 18:30:03 2005
From: coderman at gmail.com (coderman)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p portland (OR)
Message-ID: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com>

ping

From don at dhoffman.net Fri Dec 2 18:45:12 2005
From: don at dhoffman.net (Donald Hoffman)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p portland (OR)
In-Reply-To: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com>
References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com>
Message-ID: <0C7F13F7-D31C-47EB-90D0-17289D97ECAF@dhoffman.net>

Pong. Also (live) in Portland. (Actually in Montana right now. Anyone there?)
Don

On Dec 2, 2005, at 11:30 AM, coderman wrote:

> ping

From agthorr at cs.uoregon.edu Fri Dec 2 18:51:37 2005
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p portland (OR)
In-Reply-To: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com>
References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com>
Message-ID: <20051202185136.GB2604@cs.uoregon.edu>

On Fri, Dec 02, 2005 at 10:30:03AM -0800, coderman wrote:
> ping

I'm in Eugene. I'd be willing to drive up for a get-together if we
have a big enough group to make it interesting.

-- 
Daniel Stutzbach Computer Science Ph.D Student
http://www.barsoom.org/~agthorr University of Oregon

From coderman at gmail.com Fri Dec 2 19:13:33 2005
From: coderman at gmail.com (coderman)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p portland (OR)
In-Reply-To: <20051202185136.GB2604@cs.uoregon.edu>
References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu>
Message-ID: <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com>

On 12/2/05, Daniel Stutzbach wrote:
> I'm in Eugene. I'd be willing to drive up for a get-together if we
> have a big enough group to make it interesting.

i'd be happy to travel to eugene if more of the group is located there
as well. weekends would be best in that case.
From gbildson at limepeer.com Fri Dec 2 19:22:32 2005
From: gbildson at limepeer.com (Greg Bildson)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <4A60C83D027E224BAA4550FB1A2B120EC42751@fcexmb04.efi.internal>
Message-ID:

The only locality that I can think of that may have occurred back in
that early timeframe would be based on the stringiness of the network.
I have a feeling that pre-centralized hostcache, the network was more
of a long string with some clumps as it went along. So, it's possible
that the network diameter at its longest point was much larger than
max-TTL. Then, the introduction of centralized hostcaches helped create
a massive cluster and exacerbated the early modem bandwidth barrier.
This appeared to be what Gene Kan thought, I believe. It was only
months later, with the introduction of clients with keepalive pings and
flow control, that the clogged spots got freed up.

If ToadNode was correct in that they had millions of downloads in those
early days, then that's the only way that I could see the modem
bandwidth barrier not getting hit very quickly.

Thanks
-greg

From eugen at leitl.org Fri Dec 2 19:38:33 2005
From: eugen at leitl.org (Eugen Leitl)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] p2p portland (OR)
In-Reply-To:
<4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com>
References: <4ef5fec60512021030r79ec7107j5ece5ea26b4cae85@mail.gmail.com> <20051202185136.GB2604@cs.uoregon.edu> <4ef5fec60512021113w51fa4c27q73616041994220b0@mail.gmail.com>
Message-ID: <20051202193833.GD2249@leitl.org>

On Fri, Dec 02, 2005 at 11:13:33AM -0800, coderman wrote:
> On 12/2/05, Daniel Stutzbach wrote:
> > I'm in Eugene. I'd be willing to drive up for a get-together if we
> > have a big enough group to make it interesting.
>
> i'd be happy to travel to eugene if more of the group is located there
> as well. weekends would be best in that case.

All right! I'm game. ;)

-- 
Eugen* Leitl leitl
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE

From Serguei.Osokine at efi.com Fri Dec 2 19:38:41 2005
From: Serguei.Osokine at efi.com (Serguei Osokine)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42756@fcexmb04.efi.internal>

On Friday, December 02, 2005 Greg Bildson wrote:
> I have a feeling that pre-centralized hostcache, the network was
> more of a long string with some clumps as it went along.

So what kept this string from fully clumping as the connections
were broken and reestablished? Default was four connections, not two.
How is it possible not to fold this string onto itself about one
thousand times after the first 1,000 connections are reestablished -
which would take 10-15 minutes in a 1,000-node network, and would
happen instantly in a one-million-node one?
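A rough way to check the folding argument above is a toy simulation. The
sketch below is only an illustration under stated assumptions, not a model
of any real Gnutella client: it starts from a pure 1,000-node 'string' (two
connections per node), adds 1,000 random links to stand in for connections
reestablished to random hosts learned from pongs, and compares the longest
hop distance from one end of the string before and after. The function
names, the seed, and the choice not to cap node degree at four are all
hypothetical and made up for this example.

```python
import random
from collections import deque


def string_network(n):
    """Path graph: node i is linked to i-1 and i+1 (a 'long string')."""
    adj = {i: set() for i in range(n)}
    for i in range(n - 1):
        adj[i].add(i + 1)
        adj[i + 1].add(i)
    return adj


def eccentricity(adj, start):
    """Largest hop count from start to any reachable node (plain BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for peer in adj[node]:
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)
    return max(dist.values())


def add_random_links(adj, count, rng):
    """Add `count` links between random, distinct, not-yet-linked nodes.

    Assumption: this stands in for connections reestablished to random
    hosts; node degree is deliberately left uncapped to keep it short.
    """
    nodes = list(adj)
    added = 0
    while added < count:
        a, b = rng.sample(nodes, 2)
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            added += 1


rng = random.Random(2005)          # fixed seed, arbitrary choice
net = string_network(1000)
before = eccentricity(net, 0)      # end-to-end length of the bare string
add_random_links(net, 1000, rng)   # about one extra random link per node
after = eccentricity(net, 0)
print(before, after)
```

With no random links the far end of the string is 999 hops away. Once the
random links are in place, the same measurement should collapse to a small
fraction of that, which is the sense in which a string cannot stay a string
for long when connections are replaced at random.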
> If ToadNode was correct in that they had millions of downloads in
> those early days then thats the only way that I could see the modem
> bandwidth barrier not getting hit very quickly.

Between people not using the downloaded code, an error in
ToadNode stats, a miracle, and the network preserving its 'linear'
graph topology for any noticeable time, my vote will be for any one
of the first three - the last one is too improbable.

Best wishes -
S.Osokine.
2 Dec 2005.

From bryan.turner at pobox.com Fri Dec 2 20:15:45 2005
From: bryan.turner at pobox.com (Bryan Turner)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <20051201205215.GF5300@cs.uoregon.edu>
Message-ID: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com>

My $.02 on Gnutella,

The Gnutella network will scale fine to 2B nodes. However, I believe
without interest clustering or intelligent peer selection, it will
become increasingly difficult to find the data you are interested in.
IE: I feel the current architecture misses the 'long tail'.
(Note that I am not well versed in the Gnutella architecture; this
opinion is based on papers modeling the math behind Gnutella.)

I like to find the orthogonal axes in a design. P2P has lots of
interesting scalability axes:

  1 Scalability in # of nodes
  2 Scalability in # of objects
  3 Scalability in size of objects
  4 Scalability in interest for an object (hot spots)
  5 Scalability in bandwidth (protocol overhead, efficiency)
  etc.

BitTorrent captures all but #2, as multiple torrents may require
redundant connections to a peer, and torrents that share files cannot
also share swarms (not to mention BitTorrent isn't a content search
network).

Gnutella (I believe) doesn't meet #2,3 and partially #4,5:

  #2 because it does not cluster related data it will eventually
     be overwhelmed with content.
  #3 because it performs full-file transfers instead of block
     exchanges or partial file transfers
  #4/5 because clients don't immediately offer partial downloads,
     thus hot spots have a congestion delay measured in
     full-file-transfer increments rather than in block
     increments (an order of 2 for typical MP3s, easily
     reaching multiple days of congestion).

A vision for a network that scales along all axes would be Gnutella
with some structure to improve domain-specific searches, with
BitTorrent as the data transfer mechanism.

Please educate me if I've missed some facet of Gnutella!

--Bryan
bryan.turner@pobox.com

-----Original Message-----
From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Daniel Stutzbach
Sent: Thursday, December 01, 2005 3:52 PM
To: p2p-hackers@zgp.org
Subject: Re: [p2p-hackers] Re: scalability

On Thu, Dec 01, 2005 at 09:48:45PM +0100, Ronald Wertlen wrote:
> Gnutella 0.6 (is there a 0.7 protocol, I can't find it?) allows
> practically anyone to elevate to super-peer, which results in a random
> (power-law distribution) network.

Gnutella is not a power-law network.
See my paper on the graph properties of Gnutella, presented at the Internet Measurement Conference earlier this year:

http://www.usenix.org/events/imc05/tech/stutzbach.html

> Such a network is not going to perform very well as far as recall and
> precision are concerned, past a certain point. I would be interested
> to calculate that exact point (but doubting I'll get to it some time
> soon :-/).

Could you rigorously define recall and precision for me? I'm not sure what you mean by these terms.

--
Daniel Stutzbach Computer Science Ph.D Student
http://www.barsoom.org/~agthorr University of Oregon

From agthorr at cs.uoregon.edu Fri Dec 2 20:22:23 2005
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com>
References: <20051201205215.GF5300@cs.uoregon.edu> <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com>
Message-ID: <20051202202223.GC2604@cs.uoregon.edu>

On Fri, Dec 02, 2005 at 03:15:45PM -0500, Bryan Turner wrote:
> Gnutella (I believe) doesn't meet #2,3 and partially #4,5:
> #2 because it does not cluster related data it will eventually
> be overwhelmed with content.
> #3 because it performs full-file transfers instead of block
> exchanges or partial file transfers
> #4/5 because clients don't immediately offer partial downloads,
> thus hot spots have a congestion delay measured in
> full-file-transfer increments rather than in block
> increments (an order of 2 for typical MP3s, easily
> reaching multiple days of congestion).

If I am not mistaken, Gnutella has been doing partial file transfers for two or three years now. The eDonkey/eMule network does this too.

BitTorrent does not have a monopoly on this feature.
:-)

The relevant spec (if it can be called a spec) for Gnutella is here:

http://www.the-gdf.org/wiki/index.php?title=Partial_File_Sharing_Protocol

--
Daniel Stutzbach Computer Science Ph.D Student
http://www.barsoom.org/~agthorr University of Oregon

From bryan.turner at pobox.com Fri Dec 2 20:25:59 2005
From: bryan.turner at pobox.com (Bryan Turner)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] DHTs in highly-transient networks
In-Reply-To:
Message-ID: <200512022026.jB2KPx4X026153@rtp-core-1.cisco.com>

The DHTs that I've studied behave well in high-churn environments. The problem is network migration events; large swings of population in a short time.

Chord is the worst for this, as its rigid structure quickly buckles when you lose a large chunk of the network. Kademlia survives pretty well; maintaining connections with long-lived nodes is a definite win, as is maintaining connectivity to hubs/supernodes.

All of them get screwed when large populations join. The network turns to chaos for a while until things settle down. Kademlia is better off (lookups continue to work). The largest problem is the sudden lack of bandwidth due to all the key-transfers between the nodes.

In my implementations I had to add a 'slop' factor that was larger than my largest expected node-join event. During a lookup, if the 'ultimate' node didn't have the data, he passed the request through the oldest couple of nodes in the slop region. This allowed one last chance to find the right owner. It worked well in practice, but I still believe there's a better way.

--Bryan
bryan.turner@pobox.com

-----Original Message-----
From: p2p-hackers-bounces@zgp.org [mailto:p2p-hackers-bounces@zgp.org] On Behalf Of Sean Rhea
Sent: Thursday, December 01, 2005 6:02 PM
To: Peer-to-peer development.
Subject: Re: [p2p-hackers] DHTs in highly-transient networks

On Dec 1, 2005, at 5:53 PM, Michael Rogers wrote:
> To what extent does this depend on the distribution of session times
> as well as the mean?
> Kademlia assumes that old nodes will outlive new
> nodes, and Daniel's paper shows that Gnutella contains an emergent
> core of long-lived nodes - how well do Bamboo and Chord survive under
> non-uniform churn?

We used exponentially-distributed node lifetimes, so old nodes do not generally outlive new ones. I _think_ that choice only makes the problem harder, though. In particular, I would suspect that Bamboo/Chord would do just as well if old nodes lived longer than new ones, and possibly better. They won't take advantage of it like Kademlia does, but it shouldn't hurt them either. (At least that's my guess; I don't have data to prove it.)

Sean
--
When I see the price that you pay / I don't wanna grow up -- Tom Waits

From coderman at gmail.com Fri Dec 2 20:30:11 2005
From: coderman at gmail.com (coderman)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com>
References: <20051201205215.GF5300@cs.uoregon.edu> <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com>
Message-ID: <4ef5fec60512021230t2408884ew2e6002cc61fa6d92@mail.gmail.com>

On 12/2/05, Bryan Turner wrote:
> ...
> 4 Scalability in interest for an object (hot spots)
>...
> A vision for a network that scales along all axes would be Gnutella
> with some structure to improve domain-specific searches, with BitTorrent as
> the data transfer mechanism.

finding obscure / rare / unpopular resources is the flip side of the interest coin.

in alpine all discovery was done using distinct peer groups dedicated to a single domain of resource discovery (specific subjects / applications had distinct groups). peer lists were ordered within each group according to a relative quality attribute associated with that group only.

the goal was to make decentralized search efficient for very obscure resources when a centralized (or partially centralized) index search was usually required for completeness to make it effective.
the problem with this approach is that it is very hard to model in a meaningful way due to inherent dependence on relative metrics associated with human behavior. (or perhaps it will be simple(r) if a large real world network can be observed and studied)

alpine also used a pluggable module system (dlopen with c++ derived handlers) to handle arbitrary metadata associated with queries (different groups may require different search criteria and taxonomy) and integrate various transport mechanisms (a simple TCP stream transfer was provided as an example of this ability)

being able to offload such transfers to a system optimized for the purpose, like bittorrent, was a design goal and definitely makes sense in any project where cooperative content distribution is useful.

From gbildson at limepeer.com Fri Dec 2 20:35:11 2005
From: gbildson at limepeer.com (Greg Bildson)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <200512022015.jB2KFj4X024384@rtp-core-1.cisco.com>
Message-ID:

Those suppositions are fairly misplaced as is most academic work on Gnutella. I wouldn't believe any (other than Daniel Stutzbach's) academic papers describing Gnutella.

Partial file sharing is active by default. Download meshes are in place. Download chunking (pseudo-random) is in place - not rarest first but sufficient in many cases. Many improvements have been made to increase the awareness and allocation of resources but improvements can still be made.

You are correct that rare file/topic searches are still not great but are much better than historically and likely better than similar networks. Dynamic querying does a good job of satisfying popular requests at low cost and reserving more horsepower for rarer searches.

Efficiency is pretty good. Bittorrent is a tad verbose in some respects.

The only important things that are not in place in Gnutella are rarest first and tit for tat.
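
For readers who have not seen the two chunk-selection policies side by side, here is a minimal Python sketch of pseudo-random versus rarest-first selection. It is an illustration only and is not taken from any Gnutella or BitTorrent client: the function names, the set-of-chunk-indices representation, and the `swarm_views` argument (what each known peer is believed to hold) are all assumptions made for the example.

```python
import random
from collections import Counter

def pick_random(needed, peer_has, rng=random):
    """Pseudo-random policy: any needed chunk this peer can supply."""
    candidates = sorted(needed & peer_has)
    return rng.choice(candidates) if candidates else None

def pick_rarest_first(needed, peer_has, swarm_views, rng=random):
    """Rarest-first policy: among chunks this peer can supply, prefer
    the one held by the fewest known peers; ties are broken randomly."""
    candidates = needed & peer_has
    if not candidates:
        return None
    counts = Counter()
    for view in swarm_views:          # one set of chunk indices per known peer
        counts.update(view)
    fewest = min(counts[c] for c in candidates)
    rarest = sorted(c for c in candidates if counts[c] == fewest)
    return rng.choice(rarest)

# Hypothetical swarm: three peers and the chunks each one holds.
swarm = [{0, 1, 2}, {0, 1}, {0, 3}]
needed = {0, 1, 2, 3}
print(pick_rarest_first(needed, swarm[0], swarm))  # chunk 2 is held by one peer only
```

The random policy needs no knowledge of the rest of the swarm, which is presumably why it is "sufficient in many cases"; rarest-first needs an availability count per chunk, but it replicates scarce chunks before their only holder leaves.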
Thanks
-greg

From bryan.turner at pobox.com Fri Dec 2 20:40:59 2005
From: bryan.turner at pobox.com (Bryan Turner)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <20051202202223.GC2604@cs.uoregon.edu>
Message-ID: <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com>

Ah, this is news to me :) Thanks for the link.
I notice that this partial file transfer feature is only a footnote on the main protocol.. How wide spread is the partial file transfer feature among clients?

--Bryan
bryan.turner@pobox.com

From sberlin at gmail.com Fri Dec 2 21:11:53 2005
From: sberlin at gmail.com (Sam Berlin)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com>
References: <20051202202223.GC2604@cs.uoregon.edu> <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com>
Message-ID: <19196d860512021311u2344d593v77d26fd98a642e90@mail.gmail.com>

The protocol, as described nearly anywhere, isn't Gnutella. Gnutella, as others have said, really isn't a protocol (0.6 or any number) anymore. It's a hodgepodge of a lot of features, all implemented by various Gnutella clients. Partial file sharing has been in use by mainstream clients for around 1-2 years.

As Greg mentioned, academic papers tend to describe Gnutella as it was designed by Justin Frankel, and a few will include the addition of ultrapeers. It's nearly impossible to find a paper that accurately describes the current state of the network (as it exists through mainstream clients) though.

It'd likely be a fascinating subject for researchers to study & write papers on. I know I'd be interested.

Sam

From Serguei.Osokine at efi.com Fri Dec 2 21:26:00 2005
From: Serguei.Osokine at efi.com (Serguei Osokine)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
Message-ID: <4A60C83D027E224BAA4550FB1A2B120EC42759@fcexmb04.efi.internal>

On Friday, December 02, 2005 Sam Berlin wrote:
> It'd likely be a fascinating subject for researchers to study & write
> papers on. I know I'd be interested.

Yeah, well, O'Reilly wasn't :-) I submitted a proposal to the ETC two or three years ago, where I was going to talk about Gnutella being the first P2P network that is not only deployed and developed, but is also *designed* in a fully decentralized fashion.

Like you say, basically - there is some common protocol framework, but within this framework vendors are free to develop, publish, and deploy their own protocol extensions, and to implement only those extensions of the others that they like. Survival of the fittest proposals in the field, so to speak. Design without an architectural committee, voting, or any kind of central authority or even consensus on half of the issues.
This is a first and only example of such development, as far as I know. But for some reason O'Reilly was not impressed. Though I'm not much of a speaker in any case :-)

Best wishes -
S.Osokine.
2 Nov 2005.

From agthorr at cs.uoregon.edu Fri Dec 2 21:30:52 2005
From: agthorr at cs.uoregon.edu (Daniel Stutzbach)
Date: Sat Dec 9 22:13:05 2006
Subject: [p2p-hackers] Re: scalability
In-Reply-To: <19196d860512021311u2344d593v77d26fd98a642e90@mail.gmail.com>
References: <20051202202223.GC2604@cs.uoregon.edu> <200512022041.jB2Kex4X028301@rtp-core-1.cisco.com> <19196d860512021311u2344d593v77d26fd98a642e90@mail.gmail.com>
Message-ID: <20051202213051.GF2604@cs.uoregon.edu>

Perhaps we should take a cue from TCP/IP and start referring to the "Gnutella protocol suite".

On Fri, Dec 02, 2005 at 04:11:53PM -0500, Sam Berlin wrote:
> The protocol, as described nearly anywhere, isn't Gnutella. Gnutella,
> as others have said, really isn't a protocol (0.6 or any number)
> anymore. It's a hodgepodge of a lot of features, all implemented by
> various Gnutella clients. Partial file sharing has been in use by
> mainstream clients for around 1-2 years.

--
Dan