[p2p-hackers] Hybrid Static/Dynamic DHT Systems
Cortland Klein
me at pixelcort.com
Thu Apr 14 07:29:30 UTC 2005
This design proposal aims to describe a DHT system which has both a
functional mutable hash table and a replicating static content hash table.
The problem I see with traditional DHT systems is that they can not
easilly replicate data without making it much more difficult to maintain
up-to-date state. In addition, it becomes impossible to provide an
immutable reference to particular data.
A Hybrid DHT System would have two subsystems, the traditional dynamic
mutable hash table and the added replicating static content hash table.
The static content hash table is a simple table where the keys are the
SHA-1 of the content and the content is a length prefixed amount of data
limited to 256KB. When a key is requested, it's value is automatically
cached among the path of nodes to the node requesting the data. Nodes
maintain their own caches and replace old keys with new keys as they
pass through. Old keys are chosen based not only on their last request
time, but also how close the key is to the node's ID. If the node is
'responsible' for a key, it remains in the cache indefinitely. Thus, we
can achieve replication while ensuring at least one copy is always on
the network. Data larger than 256KB is split up into chunks and a
splitfile containing a list of the static content keys for each chunk is
used to represent the data.
The dynamic mutable hash table provides for each key a revisioned list
of pointers to static content keys. Thus, each dynamic key's content has
a table with datestamps and static content keys. To get the current
value for a dynamic key, the system then does a lookup for the static
content key. The node responsible for the key maintains this revisioned
list and updates it when a put is made on the key. It can then expire
old revisions after both a set number of revisions and at least a
certain amount of time have passed.
My reasoning behind having two tables is that dynamic data cannot
easilly be replicated due to its mutability, however it can point to
static data which can be replicated. By having the static data retain a
'responsible' node, we can ensure that at least one copy of the static
data is available.
Any thoughts?
--
Cortland Klein <me at pixelcort.com> (XMPP/SMTP)
nuWeb Project Founder - http://nuweb.org/ "Infinite Scalability"
Journal - http://pixelcort.com/ "Ranting and Raving"
The geeks shall take over the world and the bits shall be set free!
More information about the P2p-hackers
mailing list