[p2p-hackers] Re: The Content-Addressable Web

Gerald Oskoboiny gerald at impressive.net
Thu Oct 25 11:35:02 UTC 2001


On Thu, Oct 25, 2001 at 01:33:48AM -0500, Justin Chapweske wrote:
> I've just finished the first draft of "HTTP Extensions for a 
> Content-Addressable Web".  I believe that these simple extensions are a 
> huge step forward in providing interoperability between P2P systems.
: 
> http://onionnetworks.com/caw/caw.txt

Interesting stuff...

> HTTP Extensions for a Content-Addressable Web
> Justin Chapweske, Onion Networks (justin at onionnetworks.com)
> October 7, 2001
> 
> Abstract
> 
> The goal of the Content-Addressable Web (CAW) is to create
> a URN-based Web that can be optimized for content distribution.
> The use of URNs allows advanced caching techniques to be
> employed, and sets the foundation for creating ad hoc Content
> Distribution Networks (CDNs). This document specifies HTTP
> extensions that bridge the current Location-Based Web with
> the Content-Addressable Web. 

The Web "is the universe of network-accessible information" [1],
i.e. anything with a URI, including URIs that are not tied to a
particular hostname.

You might find this useful:

    URIs, URLs, and URNs: Clarifications and Recommendations 1.0
    Report from the joint W3C/IETF URI Planning Interest Group
    W3C Note 21 September 2001
    http://www.w3.org/TR/uri-clarification/

It attempts to clear up the confusion about URIs, URLs, and URNs.

> 2 Self-Verifying URNs
> 
> While any kind of URN can be used within the Content-Addressable
> Web, there is a specific type of URN called a "Self-Verifying
> URN" that is particularly useful. These URNs have the
> property that the URN itself can be used to verify that
> the content has been received intact. It is RECOMMENDED
> that applications use cryptographically strong self-verifying
> URNs because hosts in ad hoc CDNs and the Transient Web
> are assumed to be untrusted. For instance, one could hash
> the content using the SHA-1 algorithm, and encode it using
> Base32 to produce the following URN:
> 
> urn:sha1:RMUVHIRSGUU3VU7FJWRAKW3YWG2S2RFB

I think URIs based on SHA-1 hashes are a fantastic way to identify
resources in P2P systems, and I don't understand why most of the
P2P systems I have used don't already work like this. (At least
that's my impression as a user; I haven't studied the protocols.)
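
For illustration, a minimal sketch of how a self-verifying URN of the
kind quoted above could be produced and checked: hash the content with
SHA-1, then Base32-encode the 20-byte digest. The function names
sha1_urn and verify are hypothetical and are not taken from the draft.

```python
import base64
import hashlib


def sha1_urn(content: bytes) -> str:
    """Return a urn:sha1: URN for the given content (hypothetical helper)."""
    digest = hashlib.sha1(content).digest()  # 20 raw bytes
    # 160 bits encode to exactly 32 Base32 characters, so no '=' padding appears.
    return "urn:sha1:" + base64.b32encode(digest).decode("ascii")


def verify(urn: str, content: bytes) -> bool:
    """Check that content received from an untrusted host matches the URN."""
    return urn == sha1_urn(content)
```

A client would call verify(urn, data) once a download completes and
discard the data if the check fails.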

When I search a P2P system for a particular file, I think the search
results should be a list of filenames, content-lengths, and SHA-1
hash URIs, and perhaps other metadata.

Then the P2P client should present the results to me grouped by
their SHA-1 hashes, let me sort those results by size/filename/
number-of-peers-with-that-hash, and when I pick one of them, it
should start downloading ranges of the file from each of, say, 10
of the peers that claim to have files with that hash.
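
The grouping described above could be sketched roughly as follows.
The SearchResult record and both helper functions are assumptions
made for this example; no existing P2P client is being described.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SearchResult:
    """One hit from one peer (hypothetical record layout)."""
    peer: str
    filename: str
    content_length: int
    urn: str  # e.g. "urn:sha1:..."


def group_by_urn(results: List[SearchResult]) -> Dict[str, List[SearchResult]]:
    """Collect hits that claim the same hash, whatever their filenames."""
    groups: Dict[str, List[SearchResult]] = defaultdict(list)
    for result in results:
        groups[result.urn].append(result)
    return dict(groups)


def rank_by_peer_count(
    groups: Dict[str, List[SearchResult]],
) -> List[Tuple[str, List[SearchResult]]]:
    """Order groups so the hash offered by the most distinct peers comes first."""
    return sorted(
        groups.items(),
        key=lambda item: len({r.peer for r in item[1]}),
        reverse=True,
    )
```

Sorting by size or filename instead would only change the key function.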

Then it should continue downloading other ranges of the file from
the peers that gave the highest throughput on the first transfer,
and drop the really slow ones early.
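
A rough sketch of that scheduling idea, with the network transfers
left out: split the file into byte ranges, keep only the peers that
measured fastest in the first round, and spread the remaining ranges
over them. All names, and the round-robin assignment, are assumptions
made for this example.

```python
from typing import Dict, List, Tuple

ByteRange = Tuple[int, int]  # inclusive (first_byte, last_byte)


def split_ranges(content_length: int, chunk_size: int) -> List[ByteRange]:
    """Cut a file of content_length bytes into inclusive byte ranges."""
    return [
        (start, min(start + chunk_size, content_length) - 1)
        for start in range(0, content_length, chunk_size)
    ]


def fastest_peers(throughput: Dict[str, float], keep: int) -> List[str]:
    """Keep the `keep` peers with the highest measured bytes per second."""
    return sorted(throughput, key=throughput.get, reverse=True)[:keep]


def assign_ranges(
    ranges: List[ByteRange], peers: List[str]
) -> Dict[str, List[ByteRange]]:
    """Hand the remaining ranges to the surviving peers round-robin."""
    plan: Dict[str, List[ByteRange]] = {peer: [] for peer in peers}
    for index, byte_range in enumerate(ranges):
        plan[peers[index % len(peers)]].append(byte_range)
    return plan
```

Each (first, last) pair corresponds to what an HTTP client would send
as a "Range: bytes=first-last" request header.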

Also, it might be good for P2P systems to be able to use an external
resolver for these URIs, so I can have a general SHA-1 URI resolver
on my desktop that gets used by any P2P/Web clients I might use,
and I can set its resolution strategy according to my preferences.
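
One way such a shared resolver might look, as a sketch only: it keeps
a table from hash URNs to locations that claim to hold the content,
and a user-supplied strategy function decides the order in which a
client should try them. The class and its methods are hypothetical.

```python
from typing import Callable, Dict, List

# A strategy takes the known locations and returns them in preferred order.
Strategy = Callable[[List[str]], List[str]]


class Sha1Resolver:
    """Hypothetical desktop-wide resolver shared by several clients."""

    def __init__(self, strategy: Strategy) -> None:
        self._strategy = strategy
        self._locations: Dict[str, List[str]] = {}

    def register(self, urn: str, location: str) -> None:
        """Record that `location` claims to serve the content named by `urn`."""
        known = self._locations.setdefault(urn, [])
        if location not in known:
            known.append(location)

    def resolve(self, urn: str) -> List[str]:
        """Return candidate locations ordered by the user's strategy."""
        return self._strategy(list(self._locations.get(urn, [])))


def prefer_local_network(locations: List[str]) -> List[str]:
    """Example strategy: try 192.168.x.x hosts before anything else."""
    return sorted(locations, key=lambda loc: "//192.168." not in loc)
```

Because the URN is self-verifying, the resolver does not need to trust
any location it returns; the client checks the hash after download.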

> 3 HTTP Extensions
> 
> In order to provide a transparent bridge between the URL-based
> Web and the Content-Addressable Web, a few HTTP extensions
> must be introduced. The nature of these extensions is that
> they need not be widely deployed in order to be useful.
> They are specifically designed to allow for proxying for
> hosts that are not CAW-aware.

I haven't reviewed this section closely, but you might want to see:

    http://www.w3.org/Protocols/HTTP/ietf-http-ext/

for info on HTTP Extensions.

[1] About the World Wide Web
    http://www.w3.org/WWW/

-- 
Gerald Oskoboiny <gerald at impressive.net>
http://impressive.net/people/gerald/


