[p2p-hackers] Non-transitivity in DHTs
Michael J. Freedman
mfreed at cs.nyu.edu
Tue Nov 1 21:18:49 UTC 2005
On Mon, 31 Oct 2005, Alen Peacock wrote:
> I'm not sure if I fully understand inconsistent roots as it relates
> to kademlia. Parallel iterative queries seem to make the probability
> of getting an incorrect response from inconsistent roots less severe,
> and implementing some sort of simple consensus easy -- since S directs
> the routing process, it can compare the results from multiple
> responses. Does that seem right, or am I overlooking something more
> essential to the problem?
Well, the basic problem arises in how you are using Kademlia at the
higher-level: write-once vs. semi-mutable, single vs. replicated.
The Coral implementation (with its "sloppy" DHT structure) uses the
routing structure to attempt to find *the* closest responsible node. If
this one node is invisible to you due to non-transitivity, this is exactly
where the problem arises. One solution when using iterative lookup is to
try one-hop forwarding, i.e., if you can't connect directly, try to get
some of the node's immediate neighbors in the ID space to forward your
request and the subsequent response. (Invisible nodes along a lookup path
also degrade *performance* for iterative lookup; we describe using a
negative-result cache in the paper to reduce the performance hit of such
nodes. But certainly, the *correctness* problem when these invisible
nodes are the roots is much more severe.)
Petar's original paper describes a DHT for Kademlia in which put () was
executed to the k closest nodes (where, say, k=20 for large systems).
Now, if I remember correctly, this largely assumed that keys were unique
to values, i.e., hash(value), such that each key had a unique value.
Thus, if you find the key at any one of these k nodes, you're done.
For Coral and OpenDHT, one could store multiple values under each key (put
() was really an append), thus, these k nodes could be storing different
values. Even if you are just implementing put() as a set, if you want to
support *overwriting* old values (as opposed to a write-once model of
hash(value)), this problem still exists.
In the WORLDS paper, we describe how these k nodes can continually monitor
one another and "fix" up these inconsistencies, so that all k nodes
provide relatively good consistency properties (although these puts are
certainly not atomic over the k nodes). Your suggestion of a "consensus"
algorithm is similar, although it runs on the "client-side"---so that
every node doing a get() would need to perform such a "consensus"---as
opposed to on the server-side as we suggest.
I'm glad you liked the paper,
--mike
-----
www.michaelfreedman.org www.coralcdn.org
More information about the P2p-hackers
mailing list