[p2p-hackers] References for using hashes as unique identifiers?
Mark S. Miller
markm at caplet.com
Mon Mar 8 18:07:53 UTC 2004
At 08:55 AM 3/8/2004 Monday, Benja Fallenstein wrote:
>Mark S. Miller wrote:
>>At 12:02 AM 3/8/2004 Monday, Brad Neuberg wrote:
>>>That's an interesting reference to Xanadu.
>
>I agree. I was completely unaware that Xu ever used hashes as identifiers.
I hope I was clear that we never actually implemented this. We simply
constrained our semantics to be compatible with our future plans for
cryptographic integrity controls. Or so we thought. (It's clear to me now
that we hadn't succeeded at this.)
>(I do remember Ted talking about "Merkle hash enfilades" somewhere, which
>never made sense to me because WIDative enfilade operations are supposed to
>be horizontally associative, which hashing definitely is not?)
You are correct -- they don't make sense. However, I've recently heard about
a paper (citation?) analyzing use of non-truncated addition as a secure
associative-commutative hash-combining operator. Does anyone know of a
secure associative-but-not-commutative hash combining operator? Such a thing
would be more interesting for some uses.
In any case, we had no *sensible* ideas along those lines at Xanadu.
>>>Can you
>>>talk more about how and why the Xanadu team came up
>>>with this concept in regards to their own vision of a
>>>hypertext web?
>>A link from document X to document Y must somehow designate document Y. It had never occurred to us that the designator should depend on specifying what machine Y is hosted on. That would obviously be insane. Documents must survive longer than their hosting machine and their hosting organization. Besides, popular documents would make for hotspots in the network -- an obviously unscalable architecture.
>
>But the address exposed at least to the client in Udanax Green is the
>tumbler, which contains the hosting organization if not the machine -- how
>did you plan to securely map those to the location-independent cryptographic
>identifiers?
The best schemes we had for tumblers were no better than our current
nightmare of CA hierarchies. That was among the reasons that Udanax Gold
dispensed with the tumbler concept and went to opaque designators.
Tumblers were well loved by some. The decision to kill them was quite
controversial.
>>The architecture now known as Udanax Gold
>...
>>has two concepts corresponding to what we mean by "document": The "work" and the "edition". A link can link to either kind of thing.
>
>Weren't the links to spans of media, and by implication to all works and
>editions that transclude these spans?
>
>This is something I'd really like to understand -- did your system allow
>linking to a spanset in a *particular* work/edition? The above phrasing
>seems to imply that.
Yes it did. However, from your phrasing, I can see that you're referring to
Udanax Green concepts. In Udanax Green, the answer was also yes, but in a
messier way.
>>An edition is an immutable snapshot of the contents of a document as of some moment. We planned to eventually designate an edition by the cryptohash of its contents. Nevermind that we made other decisions that would have made this impossible -- we didn't notice the conflicts at the time. Though we didn't yet know Zooko's phrase "self-authenticating designator", we had the concept, and we thought we were on a course to achieve it.
>
>Essentially, Freenet's CHK (i.e., hash-based key) and SSK (i.e.,
>signature-based key).
>
>How were you going to find who in the network has a copy of the document
>identified by a particular hash? Flooding broadcast, or something like
>Freenet, or did you actually develop distributed hashtables back then, too? :-)
We considered this a hard problem. We were aware of Robin Hanson's ideas
towards DHTs, but couldn't figure out how to make it practical under dynamic
entry and exit. We did not consider full flooding, as we did not imagine a
world with enough disk space for that.
Instead, we postponed that problem by first building a non-distributed
server. Our vague sense was to use market pricing to adaptively move
replicas towards their locations of greater demand. Indeed, a conversation
with Phil Salin around 1980 or so, about use of markets to manage adaptive
replication for Xanadu, was one of the crucial aha's that led to the agoric
open systems papers. http://www.agorics.com/Library/agoricpapers.html
Mariposa http://s2k-ftp.cs.berkeley.edu:8000/mariposa/ shows that the idea
may have some merit.
>It's great fun to me to find out about this, since we discovered these
>concepts independently one more time in 2001 -- and I actually compared them
>to Xu's work and edition back then, but didn't know that they were based on
>cryptographic hashes and signatures -- and we actually published a paper at
>Hypertext'02 called "Freenet-like GUIDs for implementing xanalogical
>hypertext" [1]. Guess that the fundamental idea wasn't exactly new, then.
>:-) (Although it's still my understanding that our approach is significantly
>simpler to implement than an enfilade- or ent-based system, without loss of
>user-level functionality, and I believe that the efficiency loss doesn't
>make the system unworkable.)
>
>[1] http://portal.acm.org/citation.cfm?id=513386&dl=ACM&coll=portal
Cool!
We did indeed *WAY* overestimate the need for efficiency, and *WAY*
underestimate the cost of designing fancy data structures. It was a mistake
to try to engineer the ent on the way to 1.0. Better to get something
inefficient out quickly, and then worry about efficiency later. But I
didn't understand this at the time.
>IIUC, Xu also had the notion of a work pointing to another work as its
>current edition, for example "Current issue of journal X" -> "January issue
>of journal X" -> "Edition n of January issue" (allowing for edits of the
>January issue).
I don't believe so.
>>We planned to represent a work by a pair of a signing key and a signature verification key. The work would be designated by the signature verification key (or a fingerprint of this key). The right to revise a work would be based on knowledge of the signing key. To revise a work is to sign the hash of an edition (plus some meta info, perhaps identifying predecessor editions, I don't remember) as being an edition of this work, and to make the resulting signature available. Note that both editions and signed revision records can be replicated freely and are location independent.
>
>How did you intend to solve the problem of forgotten / compromised keys?
I don't remember. I'm not sure we had a sensible idea for this at the time.
>>Within just these notions, one can't reliably read the *current* state of a work. Rather, the best you can do is to get the most recent of those you could find. We expected to have something like a netnews flooding algorithm, so that this would normally be recent enough for many purposes.
>...
>>This set of ideas actually works together much better than we had any right to expect ;). But the price of "current" remains location dependence. This constraint is even harder to escape than we'd feared. The only escape I know of is quorum systems of some kind (such as http://szabo.best.vwh.net/securetitle.html ), and, IMHO, these may still be too complex to be practical.
>
>OceanStore does this, too.
>
>Our idea is to use a DHT to find the most current version available on the
>network (of course then there's the problem of data hiding attacks in DHTs).
>The idea is that if there's anybody (including the publisher) interested in
>users finding the newest version, they keep it alive in the DHT for clients
>to find. It may not always be perfect, but it seems good enough to us --
>especially with the tradeoff against location-dependence.
Cool.
--
Text by me above is hereby placed in the public domain
Cheers,
--MarkM
More information about the P2p-hackers
mailing list