[p2p-hackers] Non-transitivity in DHTs

Michael J. Freedman mfreed at cs.nyu.edu
Tue Nov 1 21:18:49 UTC 2005


On Mon, 31 Oct 2005, Alen Peacock wrote:

>  I'm not sure if I fully understand inconsistent roots as it relates
> to kademlia.  Parallel iterative queries seem to make the probability
> of getting an incorrect response from inconsistent roots less severe,
> and implementing some sort of simple consensus easy -- since S directs
> the routing process, it can compare the results from multiple
> responses.  Does that seem right, or am I overlooking something more
> essential to the problem?

Well, the basic problem arises in how you are using Kademlia at the 
higher-level: write-once vs. semi-mutable, single vs. replicated.

The Coral implementation (with its "sloppy" DHT structure) uses the 
routing structure to attempt to find *the* closest responsible node. If 
this one node is invisible to you due to non-transitivity, this is exactly 
where the problem arises.  One solution when using iterative lookup is to 
try one-hop forwarding, i.e., if you can't connect directly, try to get 
some of the node's immediate neighbors in the ID space to forward your 
request and the subsequent response.  (Invisible nodes along a lookup path 
also degrade *performance* for iterative lookup; we describe using a 
negative-result cache in the paper to reduce the performance hit of such 
nodes.  But certainly, the *correctness* problem when these invisible 
nodes are the roots is much more severe.)

Petar's original paper describes a DHT for Kademlia in which put () was 
executed to the k closest nodes (where, say, k=20 for large systems). 
Now, if I remember correctly, this largely assumed that keys were unique 
to values, i.e., hash(value), such that each key had a unique value. 
Thus, if you find the key at any one of these k nodes, you're done.

For Coral and OpenDHT, one could store multiple values under each key (put 
() was really an append), thus, these k nodes could be storing different 
values.  Even if you are just implementing put() as a set, if you want to 
support *overwriting* old values (as opposed to a write-once model of 
hash(value)), this problem still exists.

In the WORLDS paper, we describe how these k nodes can continually monitor 
one another and "fix" up these inconsistencies, so that all k nodes 
provide relatively good consistency properties (although these puts are 
certainly not atomic over the k nodes).  Your suggestion of a "consensus" 
algorithm is similar, although it runs on the "client-side"---so that 
every node doing a get() would need to perform such a "consensus"---as 
opposed to on the server-side as we suggest.

I'm glad you liked the paper,
--mike


-----
www.michaelfreedman.org                              www.coralcdn.org



More information about the P2p-hackers mailing list