[p2p-hackers] censorship resistance and anonymity (was: newbie mnet questions...)

Oskar Sandberg oskar.s at gmail.com
Mon Sep 27 19:54:50 UTC 2004


On 26 Sep 2004 09:04:11 -0300, Zooko Wilcox-O'Hearn <zooko at zooko.com> wrote:
> I'm thinking about the difference between "censorship resistance" and
> "anonymity".

As I have attempted to define it, these are two different things. In
real life, of course, one may argue that they are not: as long as
people are made of meat that can be injured, having your identity
exposed means you can be censored. But when one is trying to look at
the structure of data publishing networks, it is (IMHO) useful to try
to narrow the definition of "censorship resistance" to just how hard
it is for an attacker to make published data unavailable. This is
typically something different from author/reader anonymity, but may be
related to storer anonymity (since if you cannot tell who is storing
data, it is difficult to make it unavailable). I'll deal with the
issues you bring up with CR and anonymity separately, especially with
regard to Freenet (though it should be noted that I am no longer
actively working on the main Freenet project, and my opinions should
not be seen as the opinions of those who do).

1) Censorship Resistance

With this narrowed as described above, I propose the formal definition
of CR (*) as the ratio between the complexity of an attack to make a
given piece of data unavailable, and the complexity of an attack to
knock out the entire network. In this sense, a network would be fully
censorship resistant (CR-1 implying a ratio of 1) if removing any data
meant knocking out the entire network, while it would have no
resistance (CR-1/n) if knocking out a single node in the network was
enough to remove some data. The order of the ratio is most important,
so I'll use CR-f(n) for anything with ratio > c f(n) for some constant
c.

Both CR-1 and CR-1/n networks exist: if every document is stored at
every node, then the network is automatically CR-1, and a normal file
sharing network without caching is CR-1/n.  Whether a fully caching
network is the only possible CR-1 network is an interesting question -
I would think that without making special assumptions about the
capabilities of the attacker it probably is.
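
To make the definition and the two extremes above concrete, here is a
minimal sketch in Python. It rests on assumptions that are not part of
the definition itself: attack complexity is measured simply as the
number of nodes that must be knocked out, the attacker knows which
nodes store a document, and knocking out the entire network costs n.
The function name and the placement dictionary are made up for
illustration.

```python
from fractions import Fraction

def censorship_resistance(placement, target, n_nodes):
    """Ratio between the cost of removing `target` and the cost of
    knocking out the whole network.

    Assumption: cost is counted in nodes, so removing a document costs
    as many nodes as store it, and killing the network costs n_nodes.
    """
    holders = placement[target]
    return Fraction(len(holders), n_nodes)

n = 8
all_nodes = set(range(n))

# Every document is stored at every node: ratio 1, i.e. CR-1.
full_caching = {"doc": all_nodes}

# Plain file sharing without caching: one holder, ratio 1/n, i.e. CR-1/n.
no_caching = {"doc": {3}}

print(censorship_resistance(full_caching, "doc", n))  # 1
print(censorship_resistance(no_caching, "doc", n))    # 1/8
```

Under this node-counting model, anything strictly between the two
extremes depends entirely on how many holders the attacker has to find
and remove, which is why storer anonymity matters for CR.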

In this context, I don't believe that duplicating data by erasure
coding does much at all. The network is CR-1/n even if each piece of
data is stored in x places, and that won't change unless you start to
put some rather unrealistic restrictions on the attacker (i.e., he can
only attack y nodes in k hours or something like that). It is sort of
like the difference between a checksum and a secure one-way hash. The
checksum is good to ensure that two documents do not accidentally have
the same value, but against an intentional attack with that purpose it
is useless. In the same sense erasure codes are good at avoiding data
accidentally becoming unavailable, but useless on their own against an
attack for just that purpose.
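
The difference between accidental and intentional loss can be
illustrated with a small simulation. This is only a sketch under
stated assumptions: it models plain x-fold replication rather than a
real erasure code, replicas are placed uniformly at random, and the
targeted attacker is assumed to know where the replicas are. All
names and parameter values are hypothetical.

```python
import random

def place_replicas(n_nodes, n_docs, x, rng):
    """Store each document on x distinct nodes chosen uniformly at random."""
    return {d: set(rng.sample(range(n_nodes), x)) for d in range(n_docs)}

def targeted_attack_cost(placement, doc):
    """Nodes an informed attacker must knock out to remove one document."""
    return len(placement[doc])

def survives_random_failures(placement, doc, n_nodes, n_failed, rng):
    """True if at least one replica of doc survives n_failed random failures."""
    failed = set(rng.sample(range(n_nodes), n_failed))
    return not placement[doc] <= failed  # not every holder has failed

rng = random.Random(0)
n, x = 1000, 5
placement = place_replicas(n, 100, x, rng)

# Accidental loss: 10% of the nodes fail at random, repeated many times.
trials = 2000
alive = sum(survives_random_failures(placement, 0, n, 100, rng)
            for _ in range(trials))
print("survival rate under random failures:", alive / trials)

# Intentional loss: the cost is x nodes however large n is, so the
# ratio x/n is still of order 1/n for constant x.
print("targeted cost:", targeted_attack_cost(placement, 0), "of", n, "nodes")
```

With these parameters the document should survive essentially every
random-failure trial, while the targeted attacker only ever needs to
take out x nodes.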

The question can be raised whether Freenet is a lot better in this
respect. The truth is I don't know exactly. Freenet certainly tries
harder to make removing data difficult than most other networks, but
whether that is actually successful is anybody's guess. I would be
very curious to hear if anybody has strategies (beyond just pure
caching) by which P2P data publishing networks can be made more
resistant to censorship (even CR-1/sqrt(n) would be pretty good!).

(*) If there is already a formal definition of this I will gracefully
go hide in the intellectual corner of shame.

2) Anonymity

I think you are correct in pointing out that Freenet's attempt to do
routing and mixing at the same time ends up with pretty poor
anonymity, but I think you are using the wrong arguments. I think that
even though Freenet does not spend as much effort on it as some of the
truly paranoid mixnets, its current resistance to traffic analysis is
pretty good. Data and messages are padded in size, and with the
traffic levels being high (on the order of thousands of queries
flowing through nodes every hour) I think timing analysis would be
pretty difficult. Perhaps possible for an extremely determined and
resourceful attacker, but certainly more difficult than other methods.

Freenet's basic problem is that the anonymity it offers is, at the
very best, the "Crowds" model. All queries are forwarded in the open
(they have to be, as each node needs to route them, and thus know what
the query is), so the only anonymity left is deniability: initiating
nodes can claim they were just passing the query. This is pretty weak
to start with (personally, I would rather not be made a suspect at
all, than be a suspect who can deny it) and fails in Freenet for two
reasons:

1) Different queries in Freenet are correlated. Websites contain many
elements, and when I fetch one from Freenet my node will forward
queries for all the elements to its neighbors. It might be possible
to claim that I was forwarding just a single one of those queries, but
that all the queries for the data elements in a single site would have
been forwarded to me at the same time is highly unlikely. This is made
worse by the fact that Freenet uses splitting and erasure coding on
large files: download a movie and it might have thousands of parts. If
just one of your neighbors notices that he got queries for 100 of
those parts from you, he will have good reason to believe you
initiated all of them.

2) Rather stupidly, Freenet queries SAY when they were started. Or
rather, they say through how many more hops they should be routed,
but in practice this amounts to more or less the same thing. There is
a default value at which clients initiate queries, and if you get a
query with "hops to live" one less than that you can be pretty sure
the guy who sent it to you wasn't just forwarding.
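
Both weaknesses can be sketched from the point of view of a curious
neighbor who simply logs the queries he receives. This is an
illustration only, not Freenet's actual message format: the
(peer, key, hops_to_live) tuples, the peer and key names, and the
DEFAULT_HTL value are all hypothetical. The threshold of 100 comes
from the movie example above.

```python
DEFAULT_HTL = 20             # hypothetical default; the real value is not given here
CORRELATION_THRESHOLD = 100  # "queries for 100 of those parts"

def htl_suspects(observed, default_htl=DEFAULT_HTL):
    """Peers that sent a query whose hops-to-live is one less than the
    default, which suggests they started it rather than forwarded it."""
    return {peer for peer, _key, htl in observed if htl == default_htl - 1}

def correlation_suspects(observed, manifest, threshold=CORRELATION_THRESHOLD):
    """Peers that sent queries for at least `threshold` distinct parts
    of the same file (the parts are listed in `manifest`)."""
    parts_seen = {}
    for peer, key, _htl in observed:
        if key in manifest:
            parts_seen.setdefault(peer, set()).add(key)
    return {peer for peer, keys in parts_seen.items() if len(keys) >= threshold}

# A large file split into many parts (made-up key names).
manifest = {f"movie-part-{i}" for i in range(1000)}

# What one neighbor sees: alice asks for 120 parts at near-default
# hops-to-live, bob relays a couple of unrelated queries.
observed = [("alice", f"movie-part-{i}", DEFAULT_HTL - 1) for i in range(120)]
observed += [("bob", "movie-part-7", 12), ("bob", "other-key", 15)]

print(htl_suspects(observed))                    # {'alice'}
print(correlation_suspects(observed, manifest))  # {'alice'}
```

Either check alone is only a heuristic, but a single neighbor can run
both with nothing more than its own query log, which is what makes the
deniability argument so weak.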

Given these issues, I think I have gotten the current developers of
the main Freenet application to agree that the only way it will really
become anonymous is by adding a couple of steps of true onion based
mix routing before the Freenet part kicks in. This would be a system
much like what you (Zooko) were suggesting (if I understood correctly)
and I believe this is currently the long term plan for author/reader
anonymity in Freenet.


I wrote a pretty long document about these (and other) issues with
Freenet a couple of weeks ago. People can read it here:

http://www.math.chalmers.se/~ossa/fnet.pdf

if they are interested. It should be noted that since I wrote it there
has been some progress in rectifying some of the problems with
Freenet's development model that I point to, so not all of those
criticisms are still valid.

// oskar


