[p2p-hackers] Re: Kademlia
mgp at ucla.edu
Sun Dec 19 04:30:25 UTC 2004
Kademlia, Chord, Pastry... They all have different underlying geometries,
but there's not much difference between them when it comes to what
matters: They all route messages in O(log N) time with O(log N)
connectivity and state per node. Just a few comments though...
First, the idea where Kademlia fills its tree with nodes that have been
online the longest can also be emulated in Pastry -- If you have some row
and column of your routing table filled with a live node on the network,
and a new node contacts you to notify you of its arrival which also
belongs in the same row and column, simply disregard the notification as
opposed to replacing the current live entry (or just note its IP address
for later in case your current entry dies). In this way, older nodes are
not replaced by new nodes, just like in Kademlia. In the _default_ Chord
algorithm, however, you can't do this -- fingers in your routing table
must point to the first nodes that succeed your position plus half the
ring, plus a quarter of the ring, plus an eighth of the ring, etc...
Although this requirement can be relaxed while still maintaining O(log N)
lookup time [see "Designing a DHT for low latency and high throughput" by
Dabek et. al]. If you do that, then I suspect you can weave prioritizing
oldest nodes first into Chord as well.
Second, using the default implementation of Pastry (called FreePastry,
available through Rice) performs poorly under networks with heavy churn.
Basically, if nodes join or leave the network too fast, remaining nodes
waste so much bandwidth and effort on keeping their leaf sets current
that most messages fail delivery. The Bamboo DHT, which slightly modifies
the Pastry join algorithm, uses an 'epidemic' approach to updating leaf
sets and solves this problem. See Publications at www.bamboo-dht.org for
Third, and this is just my naive opinion -- Kademlia seems more
complicated to implement than Chord or Pastry. If I recall correctly, you
have a concurrency parameter introducing parallel lookups, publishing and
lookups across k nodes instead of 1, and kind of an ad hoc solution to
imbalanced trees... This doesn't mean I'm bagging on Kademlia -- its
advantages include being the most popular DHT in practice (see eMule --
so it is definitely a proven protocol _in practice_, which is huge), you
don't have to worry about maintaining the consistency of a ring (although
this may not be much of an issue anyway), and you can fill your routing
table "passively". I'm just saying the protocol seems complicated in
comparison, not that it's worse. Again, this is just me spouting off my
naive opinion ;)
So for high performance DHTs where anonymity doesn't matter, I think
Kademlia, Chord, and Pastry (Bamboo) are all perfectly reasonable
options. In my upcoming p2p app (heh, whenever that will be completed)
I'm using the Bamboo protocol -- but this stems from learning Pastry
before either Chord and Kademlia, so I felt more comfortable implementing it.
On Sat, 18 Dec 2004 20:38:06 +0000 Continuity wrote:
> Peter Hunt wrote:
> > Do you have another DHT algorithm you can suggest? What's the "best"
> > one right now?
> Honestly, I don't know either enough about the subject or enough
> different algorithms to be able to answer that.. I'm quite (very) new to
> the subject.. I like some things about kademlia - up-time is inversely
> proportional to the probability of nodes going offline soon, for
> example.. so don't let me put you off looking at it, it's just that
> cleverer people than me have said it's not perfect :)
> Both chord and pastry(tapestry? I think it's the same 'family') seem
> active and well-researched, but I don't know much about their design.
> If anyone else could answer this question better, or at least give an
> overview of what realistic options developers have right now, that would
> be nice =)
> p2p-hackers mailing list
> p2p-hackers at zgp.org
> Here is a web page listing P2P Conferences:
More information about the P2p-hackers