[p2p-hackers] References for using hashes as unique identifiers?
Benja Fallenstein
b.fallenstein at gmx.de
Mon Mar 8 16:55:01 UTC 2004
Mark S. Miller wrote:
> At 12:02 AM 3/8/2004 Monday, Brad Neuberg wrote:
>>That's an interesting reference to Xanadu.
I agree. I was completely unaware that Xu ever used hashes as identifiers.
(I do remember Ted talking about "Merkle hash enfilades" somewhere,
which never made sense to me because WIDative enfilade operations are
supposed to be horizontally associative, which hashing definitely is not?)
>>Can you
>>talk more about how and why the Xanadu team came up
>>with this concept in regards to their own vision of a
>>hypertext web?
>
> A link from document X to document Y must somehow designate document Y. It
> had never occurred to us that the designator should depend on specifying
> what machine Y is hosted on. That would obviously be insane. Documents must
> survive longer than their hosting machine and their hosting organization.
> Besides, popular documents would make for hotspots in the network -- an
> obviously unscalable architecture.
But the address exposed at least to the client in Udanax Green is the
tumbler, which contains the hosting organization if not the machine --
how did you plan to securely map those to the location-independent
cryptographic identifiers?
> The architecture now known as Udanax Gold
...
> has two concepts corresponding to what we mean by "document": The "work" and
> the "edition". A link can link to either kind of thing.
Weren't the links to spans of media, and by implication to all works and
editions that transclude these spans?
This is something I'd really like to understand -- did your system allow
linking to a spanset in a *particular* work/edition? The above phrasing
seems to imply that.
> An edition is an immutable snapshot of the contents of a document as of some
> moment. We planned to eventually designate an edition by the cryptohash of
> its contents. Nevermind that we made other decisions that would have made this
> impossible -- we didn't notice the conflicts at the time. Though we didn't
> yet know Zooko's phrase "self-authenticating designator", we had the
> concept, and we thought we were on a course to achieve it.
Essentially, Freenet's CHK (i.e., hash-based key) and SSK (i.e.,
signature-based key).
How were you going to find who in the network has a copy of the document
identified by a particular hash? Flooding broadcast, or something like
Freenet, or did you actually develop distributed hashtables back then,
too? :-)
It's great fun to me to find out about this, since we discovered these
concepts independently one more time in 2001 -- and I actually compared
them to Xu's work and edition back then, but didn't know that they were
based on cryptographic hashes and signatures -- and we actually
published a paper at Hypertext'02 called "Freenet-like GUIDs for
implementing xanalogical hypertext" [1]. Guess that the fundamental idea
wasn't exactly new, then. :-) (Although it's still my understanding that
our approach is significantly simpler to implement than an enfilade- or
ent-based system, without loss of user-level functionality, and I
believe that the efficiency loss doesn't make the system unworkable.)
[1] http://portal.acm.org/citation.cfm?id=513386&dl=ACM&coll=portal
IIUC, Xu also had the notion of a work pointing to another work as its
current edition, for example "Current issue of journal X" -> "January
issue of journal X" -> "Edition n of January issue" (allowing for edits
of the January issue).
> We planned to represent a work by a pair of a signing key and a signature
> verification key. The work would be designated by the signature verification
> key (or a fingerprint of this key). The right to revise a work would be
> based on knowledge of the signing key. To revise a work is to sign the hash
> of an edition (plus some meta info, perhaps identifying predecessor
> editions, I don't remember) as being an edition of this work, and to make
> the resulting signature available. Note that both editions and signed
> revision records can be replicated freely and are location independent.
How did you intend to solve the problem of forgotten / compromised keys?
> Within just these notions, one can't reliably read the *current* state of a
> work. Rather, the best you can do is to get the most recent of those you
> could find. We expected to have something like a netnews flooding algorithm,
> so that this would normally be recent enough for many purposes.
...
> This set of ideas actually works together much better than we had any right
> to expect ;). But the price of "current" remains location dependence. This
> constraint is even harder to escape than we'd feared. The only escape I know
> of is quorum systems of some kind (such as
> http://szabo.best.vwh.net/securetitle.html ), and, IMHO, these may still be
> too complex to be practical.
OceanStore does this, too.
Our idea is to use a DHT to find the most current version available on
the network (of course then there's the problem of data hiding attacks
in DHTs). The idea is that if there's anybody (including the publisher)
interested in users finding the newest version, they keep it alive in
the DHT for clients to find. It may not always be perfect, but it seems
good enough to us -- especially with the tradeoff against
location-dependence.
- Benja
More information about the P2p-hackers
mailing list