[p2p-hackers] paper "rarest first and choke algorithms are enough"

Ranus networksimulator at gmail.com
Thu May 4 03:36:16 UTC 2006


My thoughts on the two issues Bryan brings:
 
On fairness
 
Both are saying something true. The point is whether the experiments carried
out capture any situation that seeders are scarce. 
 
But generally I don't think free-riders can be eliminated even with a
byte-for-byte fairness heuristic. For example, free-riders contributes
nothing, but if their downloading bandwidths are also small (compare with
other leechers), the fair share offered by seeders can well saturate this
capacity. Also joining peers without any information of the free riders and
optimistic unchoking inevitably feed the free riders. [1] tries to replace
optimistic unchoking by estimate the bandwidth between peers, how well does
this work? I don't know if any better approaches can solve this problem.
 
The good thing is, bitTorrent introduce seeders to provide extra resources.
And if we want to prove/disprove the current algorithm is not efficient esp.
when uploading capacity is rare, two things should be demonstrated: 
1. whether peers can pick up optimal remote downloaders in the neighborhood
quickly. 
2. If they can do 1, how much performance loss they undertake by the current
optimistic unchoking algorithm; if they cannot, how far from optimal do we
expect the downloader selection.
 
 
On Network Coding
 
I am not really familiar with network coding, the only example in P2P I read
is "Network Coding for Large Scale Content Distribution" [2]. But network
coding often requires a global view or good heuristics on what to generate.
In [2], to generate highly needed (innovative in this article) needs
communication among the neighbors. Anybody knows some real-world,
large-scale P2P system that provides realistic data?
 
 
 
Nice article, Arnaud, I think I've read your previous article on BitTorrent
months ago. Very solid data, and I'm curious on your methodology in choosing
the sessions (as they should be representative).
 
[1] AR Bharambe, C. Herley, and VN Padmanabhan, "Analyzing and. improving
bittorrent performance," Technical Report, MSR-TR-2005-03
[2]  <http://research.microsoft.com/~pablo/papers/nc_contentdist.pdf>
http://research.microsoft.com/~pablo/papers/nc_contentdist.pdf

--
Ranus Yue
Tsinghua University


 



  _____  

From: p2p-hackers-bounces at zgp.org [mailto:p2p-hackers-bounces at zgp.org] On
Behalf Of Bryan Turner
Sent: Thursday, May 04, 2006 1:57 AM
To: Peer-to-peer development.
Subject: Re: [p2p-hackers] paper "rarest first and choke algorithms are
enough"


Arnaud,

    Thanks for the new paper, there is a lot of nice data here.  The
presented graphs and analysis indeed suggest that the BitTorrent protocol's
choices achieve near-optimal data distribution in its paradigm (single file,
block-based content distribution).  I agree with your suggestions for
fairness criteria, and my own suggestions are similar (my suggestion is for
a reputation-based system to model leechers and free-riders). 

    However, I disagree with several conclusions you make along the way.  A
brief overview:
- Byte-for-byte fairness is not appropriate
- Network Coding is not justified

Fairness:

    I am unconvinced by the arguments against byte-for-byte fairness.  Your
evidence to support this is that tit-for-tat does not take into account the
excess capacity of some torrents (ie: more seeders than leechers).  Yet this
situation is equivalent under byte-for-byte and choking rules. Leechers are
upholding the byte-for-byte policy (seeders cannot, as you mention) and thus
only leechers are restricting their upload capacities to free-riders, and
rightly so.  Seeds (who hold the excess capacity) are distributing this
capacity fairly to all peers.  Thus, choking and byte-for-byte are
equivalent in this regard - free riders receive a share of the bandwidth
equal to the fair distribution given by seeders in both cases. 

    The more important case, when excess capacity is not available, free
riders should be eliminated or diminished.  Because leecher participation
cannot be proven, the only other method of measurement is via equal exchange
of services.  Byte-for-byte is one such measurement (reputation systems are
another).  In this scenario, free riders will only receive a small credit
from the leechers in the swarm before having to resort to hand-outs from the
seeds.  With the seeds giving fair capacity to the swarm, the free riders
will receive very little of the swarm capacity, while the participating
leechers will receive fair resources. 

    Free riders always receive more resources under choking than under
exchange measurement models.

Network Coding:

    On the topic of Network Coding [1], I did not see any evidence in your
paper refuting the network coding literature.  Please correct me if I missed
it, but it appears your only rebuttal is in the poor modeling of BitTorrent,
not in the application of network coding techniques.  From this I don't see
how you conclude that network coding is not justified?  I agree that most of
the simulation literature does not model BitTorrent correctly, but this
doesn't negate the positive results of the network coding trials, only their
comparisons to BitTorrent. 

   In section 4-A, pp 6-7, you suggest that source or network coding may not
propose interesting pieces to peers during the startup phase.  I counter
that this is exactly what network coding solves - every packet has an
innovation rate based on the network flows from the local node to the
source(s).  Assuming a densely connected graph with fair sharing from the
seeder, it is unlikely that any packet is not innovative during the startup
phase.  Unless the local node's bandwidth is many multiples of the seeder, a
problem which is not specific to network coding. 

   Network Coding is superior to Rarest First and other BitTorrent-like
protocols for several reasons:
1) The innovation rate is the min-cut of all flows from the local node to
the source(s), which is likely to be much larger than BitTorrent's default
of 4 active connections. 
2) BitTorrent transfers pieces in blocks-at-a-time, but only advertises in
piece-at-a-time.  This injects a significant delay between downloading of an
innovative byte and sharing of that byte (on average, one half the download
time of an entire piece).  Network coding imposes no such delay. 
3) There is no first-blocks or last-blocks problem with network coding, they
are entirely non-issues.  Large portions of your paper are devoted to these
issues which simply don't occur under network coding.
4) The meta-network utilization is higher for network coding.  That is,
since BitTorrent cannot align to the underlying physical connections of the
network elements, it is impossible to maximally utilize them.  Network
coding (using an innovation rate tracking algorithm) can rapidly align to
the structure of the physical network, eliminating cross-core
retransmissions.  This saves a significant amount of ISP bandwidth and
reduces the overall burden of supporting filesharing services on the
internet. 

Of course, network coding does have cons also:
- Typically larger memory footprint
- Typically larger CPU requirements (falling rapidly with new research)
- Typically more complex supporting protocols (innovation tracking, security
checks, etc) 

    Is the full implementation of Network Coding superior enough to
BitTorrent that it offers compelling reasons for migration to it as a
solution?  Your paper argues that BitTorrent is "good enough", and I tend to
agree. 

Please discuss!  :-)
--Bryan
bryan.turner at pobox.com

[1] Network Coding: An Instant Primer -
http://infoscience.epfl.ch/search.py?recid=64508
<http://infoscience.epfl.ch/search.py?recid=64508> 



On 5/3/06, Arnaud Legout <Arnaud.Legout at sophia.inria.fr> wrote: 

Hello,

we have a new paper called "rarest first and choke algorithms are enough"
that is about an experimental evaluation of the two core BitTorrent
algorithms
on real torrents.

A. Legout, G. Urvoy-Keller, and P. Michiardi. Rarest First and Choke 
Algorithms Are Enough. Technical Report
(inria-00001111, version 1 - 13 February 2006), INRIA, Sophia Antipolis,
February 2006.
http://hal.inria.fr/inria-00001111/en
<http://hal.inria.fr/inria-00001111/en> 

Comments are welcomed.

Regards,
Arnaud Legout.

--
Arnaud Legout, Ph.D.

INRIA Sophia Antipolis - Planète  Phone : 00.33.4.92.38.78.15
2004 route des lucioles - BP 93   Fax   : 00.33.4.92.38.79.78 
06902 Sophia Antipolis CEDEX      E-mail: arnaud.legout at sophia.inria.fr
FRANCE                            Web   :
http://www-sop.inria.fr/planete/Arnaud.Legout/index.html



-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://zgp.org/pipermail/p2p-hackers/attachments/20060504/99f4b3ec/attachment.htm


More information about the P2p-hackers mailing list