[p2p-hackers] Identity Crisis: Anonymity vs Reputation in P2P Systems

Sam Joseph sam at neurogrid.com
Thu Mar 18 19:01:39 UTC 2004


Hi All,

So another interesting paper I found is

"Identity Crisis: Anonymity vs Reputation in P2P Systems"
Marti & Garcia-Molina, 3rd P2P conference
http://dbpubs.stanford.edu:8090/pub/2003-41

I think this is a solid paper, and its main thrust seems to be a
comparison between centrally managed identity and self-managed
identity in a Gnutella-like file sharing system.

Interestingly, they use metadata to mediate their concept of
authenticity, e.g. we might have a document with metadata as follows:

Metadata
Title: A Tale of Two Cities
Author: Charles Dickens
Publish Date: April 2002
Publisher: Barnes & Noble Books
...
Content
It was the best of times, it was the worst of
times...

"In general, a document is considered authentic if and only if its
metadata fields are “consistent” with each other and the content. If any
information in the metadata does not “agree” with the content or the
rest of the metadata, then the document is considered to be inauthentic,
or fake. In the example above, if the Author field were changed to
Charles Darwin, this document would be considered inauthentic, since
Barnes & Noble Books has never published a book titled A Tale of Two
Cities written by Charles Darwin that begins “It was the best of
times...”"

In Marti and Garcia-Molina's approach, authentication is something that
is performed by a centralised authority, and one point of comparison for
them between self-managed and centralised identity is how frequently
authentication is required. They describe this in terms of the
verification ratio: the number of times authenticity has to be verified,
divided by the number of successful queries.
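
To make the verification ratio concrete, here is a minimal sketch of the
definition above. The function name and the counts are my own
illustration and are not taken from the paper.

```python
def verification_ratio(verifications: int, successful_queries: int) -> float:
    """Authenticity checks performed per successful query."""
    if successful_queries <= 0:
        raise ValueError("need at least one successful query")
    return verifications / successful_queries


# Hypothetical counts for illustration only, not results from the paper.
ratio = verification_ratio(verifications=150, successful_queries=100)
print(ratio)  # 1.5
```

A ratio of 1.0 would mean every successful query needed exactly one
check; anything above that means some checks were spent on documents
that turned out to be fake.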

I would recommend reading this paper, but to summarise some of their 
results, they look at the effects of different strategies such as always 
selecting the most reputable source, versus a weighted selection, as 
well as considering different values of "default" reputation and trust 
thresholds.
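
As a rough sketch of what these strategies might look like, assuming
reputations are plain numbers between 0 and 1: the function names, the
default reputation and the threshold value below are all hypothetical,
and the paper's exact weighting and threshold rules may well differ.

```python
import random
from typing import Dict, List


def eligible(responders: List[str], known: Dict[str, float],
             default_rep: float, threshold: float) -> Dict[str, float]:
    """Give unknown peers the default reputation, then drop peers below the threshold."""
    rated = {p: known.get(p, default_rep) for p in responders}
    return {p: r for p, r in rated.items() if r >= threshold}


def select_best(reputations: Dict[str, float]) -> str:
    """'Best': always pick the most reputable source."""
    return max(reputations, key=reputations.get)


def select_weighted(reputations: Dict[str, float], rng: random.Random) -> str:
    """'Weighted': pick a source with probability proportional to its reputation."""
    sources = list(reputations)
    weights = [reputations[s] for s in sources]
    return rng.choices(sources, weights=weights, k=1)[0]


# Hypothetical values: peer "c" has never been seen, so it gets the default.
known = {"a": 0.9, "b": 0.2}
candidates = eligible(["a", "b", "c"], known, default_rep=0.5, threshold=0.3)
best = select_best(candidates)
weighted = select_weighted(candidates, random.Random(0))
```

Here "b" falls below the threshold and is excluded, the Best strategy
always returns "a", and the Weighted strategy returns "a" or "c" with
probability proportional to their reputations.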

What they find is that in their simulations the efficiency is roughly as 
follows:

LoginBest > LoginWeighted > SelfMgdBest > SelfMgdWeighted

Where Best and Weighted refer to the download selection strategies, and
Login and SelfMgd refer to centralised and decentralised identity
management respectively.
We should note that the inequality above generalises over some
interesting variations.

One other point of interest is the way in which Login and SelfMgd are
implemented in the simulation. Login means nodes can't shake their
identities, and SelfMgd means that after a node cheats once it discards
its identity and starts over.
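
The difference between the two identity models can be sketched roughly
as follows. This is my own toy illustration, not the paper's simulator:
the Peer class, the reputation table and the DEFAULT_REP value are all
assumptions.

```python
import itertools

DEFAULT_REP = 0.5            # hypothetical default reputation for unknown identities
_next_id = itertools.count(1)
reputation = {}              # identity -> reputation, as seen by other peers


class Peer:
    def __init__(self, self_managed: bool):
        self.self_managed = self_managed
        self.identity = next(_next_id)

    def after_cheating(self) -> None:
        """Login peers are stuck with their identity; SelfMgd peers start over."""
        if self.self_managed:
            self.identity = next(_next_id)


def report_cheat(peer: Peer) -> None:
    """Record the cheat against the current identity, then let the peer react."""
    reputation[peer.identity] = 0.0
    peer.after_cheating()


def lookup(peer: Peer) -> float:
    """Reputation of whatever identity the peer currently presents."""
    return reputation.get(peer.identity, DEFAULT_REP)


login_peer = Peer(self_managed=False)
selfmgd_peer = Peer(self_managed=True)
report_cheat(login_peer)
report_cheat(selfmgd_peer)
print(lookup(login_peer), lookup(selfmgd_peer))  # 0.0 0.5
```

The Login peer carries its bad reputation forward, while the SelfMgd
peer reappears as a stranger with the default reputation, which is why
the choice of default value matters so much in the self-managed case.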

Anyways, I think the interplay between metadata and trust here was 
interesting ....

CHEERS> SAM





