[p2p-hackers] Decentralized search engines
David Barrett
dbarrett at quinthar.com
Fri Dec 9 02:32:06 UTC 2005
SIMON Gwendal RD-MAPS-ISS wrote:
> This
> information space does not initially contain the web. Our idea is to
> consider that the cache (or history) of the web browser should be, by
> default, included in the published set of documents.
I assume you have a good answer for this, but how will you prevent (for
example) cached copies of Hotmail from ending up in your system?
Also, is there any way to correlate an actual web URL with content in
your system? For example, if a search in your system finds a cached
webpage, could the result offer a "(www)" link that points back to the
original page?
Finally, is there any way to create a "private" subset of the network,
so (for example) everyone in my company can use this to get quick access
to everyone else's documents, but nobody outside my company can use it
to get in?
Regardless, this looks fantastic.
If you're in a US-centric business frame of mind, you might consider
using this to ensure Sarbanes-Oxley compliance. Especially if it were
scriptable -- have a series of "kill words" that should never appear in
any document anywhere in a company (including a specific customer name,
or a case number, or whatever). Then have a server loop through the
kill list every night and raise a red flag if it finds a document that
shouldn't exist.
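
To make the kill-list idea concrete, here is a rough sketch of the nightly pass. It assumes the documents are reachable as plain-text files under a single directory, which is not how the system under discussion necessarily exposes them; the function name find_kill_words and the directory layout are made up for illustration.

```python
import re
from pathlib import Path


def find_kill_words(root, kill_words):
    """Return {file path: [kill words found]} for files under root.

    Hypothetical helper: assumes documents are plain-text files on disk,
    not entries in the actual search network.
    """
    if not kill_words:
        return {}
    # Case-insensitive match on any listed term, not embedded in a longer word.
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(w) for w in kill_words) + r")(?!\w)",
        re.IGNORECASE,
    )
    flagged = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        # Undecodable bytes are skipped rather than aborting the scan.
        text = path.read_text(errors="ignore")
        hits = sorted({m.group(1).lower() for m in pattern.finditer(text)})
        if hits:
            flagged[str(path)] = hits
    return flagged
```

A cron job (or any scheduler) could call this once a night and mail out whatever comes back non-empty as the "red flag". A real deployment would presumably query the search index itself rather than walk a filesystem.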
-david