[p2p-hackers] Re: [decentralization] The Content-Addressable Web

Gordon Mohr gojomo at usa.net
Thu Oct 25 16:18:01 UTC 2001


[bluesky dropped, as it's not accepting my non-subscriber messages]

Mark Baker writes:
> > However, the latest W3C "clarification"
> > seems to simply confirm the practice of using some URIs as if
> > they were URNs.
> 
> Even HTTP URIs.
> 
> Yes, that debate still exists.  In general, REST proponents
> believe that any URI is only as persistent as the authority is
> willing to make it.  The dependence of the HTTP URI scheme on
> DNS, and that an authority doesn't "own" it, is often mentioned
> as a reason why HTTP URIs are less persistent.  But that ignores
> the fact that for urn:<nid>:foo, the registrant doesn't own
> "nid" either.  If IBM runs out to register "microsoft" as a NID,
> you can guarantee WIPO will eventually get involved.  It's just
> another central registry.

But with identifiers which are inherent, rather than assigned,
it doesn't matter who registers/"owns" something. Only the
consensus definition of the process for creating the name from
the content matters.
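
A minimal sketch of what such a consensus process could look like.
The function name and the choice of Base32 for the digest are
assumptions made for illustration; nothing above fixes an encoding.

```python
import base64
import hashlib


def content_urn(data: bytes) -> str:
    """Derive a name from the content alone (hypothetical helper)."""
    digest = hashlib.sha1(data).digest()  # 20 bytes
    # Assumption: Base32 encoding. A 20-byte digest encodes to exactly
    # 32 characters, so no '=' padding appears.
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")
```

Anyone holding the same bytes derives the same name, with no registry
involved; anyone holding different bytes derives a different one.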

> > If the names are self-verifying, as with secure hashes, you 
> > don't have to make a trust/no-trust decision about caches,
> > at least not at the outset. You can make the trust/no-trust
> > decision on what they give you. Transgressors are always
> > caught.
> 
> I have an issue with the name "self-verifying".
> 
> If I do a GET on urn:sha-1:234234KJASDFKAJFD, I don't know that
> I'm getting back a resource with that hash.  I still have to
> run the hash and compare it to the URI, because the cache may
> be lying.  So the onus of verification lies with the client,
> and therefore the URI isn't self-verifying.

Turn the problem around. You have a resource. You want 
to share it. Sure, you can give it an arbitrary name 
in the HTTP URL namespace, under some hostname you 
control. 

But then no one will know whether what you have is the 
same thing as what they have. Third parties looking for
your exact file cannot tell, from just a legal HTTP 
URL, if what you're offering is what they seek.

You could start a convention for embedding a unique, 
location-independent identifier into your URLs -- as 
RFC2169 suggests, or you suggest later in your message.
I believe that is a useful approach.

But then, you now have something else interesting to 
advertise -- the "true name" of the content. In fact, you 
don't care at all about the domain-name and request-URI 
stuff, you'll let those take any value, as long as the 
reliable name matches and checks out.
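
As a rough sketch of "any location will do, as long as the name
checks out": the helpers below are hypothetical, the sources are
plain callables standing in for whatever hosts or caches you ask,
and the Base32 digest encoding is an assumption.

```python
import base64
import hashlib
from typing import Callable, Iterable, Optional

PREFIX = "urn:sha1:"


def matches_name(urn: str, data: bytes) -> bool:
    """True if data hashes to the digest carried in the URN."""
    if not urn.lower().startswith(PREFIX):
        return False
    expected = urn[len(PREFIX):].upper()
    # Assumption: the digest in the name is Base32-encoded SHA-1.
    actual = base64.b32encode(hashlib.sha1(data).digest()).decode("ascii")
    return actual == expected


def fetch_verified(
    urn: str, sources: Iterable[Callable[[], bytes]]
) -> Optional[bytes]:
    """Try each source in turn; return the first response that verifies."""
    for fetch in sources:
        try:
            data = fetch()
        except Exception:
            continue  # an unreachable source is simply skipped
        if matches_name(urn, data):
            return data  # where it came from no longer matters
    return None
```

A source that returns altered bytes is rejected by the check and the
next source is tried; no up-front trust decision is needed.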


> But if it's still really important for this system to have the
> hash in the URI, how about this;
> 
> http://foobar.org/my_content?sha1hash=32452345ASDASDFASDFS

The "N2R" resolution services discussed in RFC2169 and CAW are
very much like what you suggest.

In the above, "sha1hash=32452345ASDASDFASDFS" is a unique,
location-independent, durable name for the content. So why
not call it an URN, and make its qualities explicit in the
syntax? Like:

    http://foobar.org/my_content?name=urn:sha1:32452345ASDASDFASDFS

And then, when you are completely indifferent about where
you get the matching content -- indifferent about location,
protocol, everything -- why should an application keep shuttling 
the "http://foobar.org/my_content?name=" portion around?

Just go with:

    *?name=urn:sha1:32452345ASDASDFASDFS

or better yet

    urn:sha1:32452345ASDASDFASDFS
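
To illustrate that the three forms above carry the same name, here
is a small sketch. The function and pattern are hypothetical, and
the pattern assumes the digest part is plain alphanumeric, as in
the placeholder used here.

```python
import re
from typing import Optional

# Assumption: the digest portion is alphanumeric, as in the examples.
URN_PATTERN = re.compile(r"urn:sha1:[A-Za-z0-9]+", re.IGNORECASE)


def extract_urn(reference: str) -> Optional[str]:
    """Pull the location-independent name out of any of the forms."""
    match = URN_PATTERN.search(reference)
    return match.group(0) if match else None
```

Everything before the URN (host, path, parameter name) can be thrown
away without losing the ability to identify the content.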

> > You don't want a "mostly acceptable" MP3 to have the same 
> > reliable name as the "official" or "consensus 'best'" version.
> 
> HTTP handles that with variants (different representations of the
> same resource).  Variants can have their own URIs too.  So you
> could have;
> 
> http://myfavband.org/song/myfavsong (the main URI)
> 
> plus these variants;
> 
> http://myfavband.org/song/myfavsong/mp3/256k
> http://myfavband.org/song/myfavsong/mp3/128k
> http://myfavband.org/song/myfavsong/wav/56k
> http://myfavband.org/song/myfavsong/au/8k

These names still don't come close to uniquely identifying 
specific instances. (There are, for example, trillions of
equally-valid MP3 encodings of a song.) When you want a
specific digital file, one that is an exact copy of an
original/official/recommended version, you want a precise
name.

> > I'd say you want one (a hash-based URN) to serve as the resource's
> > unfalsifiable "true name". You might want several others (traditionally,
> > URLs) to reflect the resource's current reachable locations, or 
> > its names within alternate delivery systems. 
> 
> Do you have an example of a resource that is best named by its
> hash?

  - a recipe for a chemical process that could
    be dangerous if any of the steps or quantities
    are slightly altered
  - a 2-hour video you want to grab as equal parts
    from 120 different sources, with no glitches
  - a compiled executable for which versions with 
    malicious code might be floating around

- Gordon