[p2p-hackers] Generalizing BitTorrent..

Bryan Turner bryan.turner at pobox.com
Fri Jan 14 17:29:01 UTC 2005

	I apologize to Vaste for missing his earlier post on this topic
(http://zgp.org/pipermail/p2p-hackers/2004-December/002290.html).  We're
definitely thinking along the same lines.

> I guess I am suggesting that you might be able to do this without
> changing the Bittorrent protocol. Perhaps you could suggest how
> protocol changes would improve efficiency over a scheme based on
> conventions similar to those described above.

I believe the protocol needs to change for several reasons:

1.  The BitTorrent protocol exchanges 'piece lists' between clients as a
bit vector for a particular torrent.  This bit vector (and the later 'update'
messages) requires that each client have the exact same torrent as the
clients it is connected to.  It is impossible to generalize this to piece
lists for meta-torrents without changing the protocol.

2.  Trackers track torrents by their torrent IDs, and gather statistics in
aggregate for a specific torrent.  This information in its current form
can't be generalized for meta-torrents.

3.  The protocol assumes only one torrent is being transferred at a time
(new versions and advanced clients remove this restriction, but not
intelligently).  If a user is interested in several files, their client
should seek out the peers which are interested in the SAME files, and keep
them preferentially over other peers.  This is not possible using the
current set of messages in BitTorrent.


	Since there seems to be some interest, here's how I would change
the protocol:

	First, the torrent file should be thought of as a catalog of pieces
to retrieve, and instructions on how to paste them together into a
collection.  The trackers and clients don't care about the final glue-up,
they simply locate and exchange pieces based on their piece ID.

	Second, piece IDs should be their Content Hash.  This guarantees
that overlapping catalogs of pieces will share the same pool of peers.
Since peers are looking for pieces by content hash, they don't care which
torrent their trading partner is interested in, only that the data is the
same.
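
	A rough sketch of the content-hash idea, not taken from any real
client.  The piece size, the choice of SHA-1, and the names piece_ids,
catalog_a and catalog_b are all assumptions made for illustration.  It shows
two different catalogs ending up with common piece IDs wherever their
contents overlap:

```python
import hashlib

PIECE_SIZE = 4  # assumption: tiny pieces so the example is easy to follow


def piece_ids(data: bytes, piece_size: int = PIECE_SIZE) -> list[str]:
    """Split data into fixed-size pieces and name each piece by its SHA-1 digest."""
    return [
        hashlib.sha1(data[i:i + piece_size]).hexdigest()
        for i in range(0, len(data), piece_size)
    ]


# Two hypothetical catalogs whose contents partly overlap.
catalog_a = piece_ids(b"AAAABBBBCCCC")
catalog_b = piece_ids(b"BBBBCCCCDDDD")

# Pieces with identical bytes get identical IDs, whichever catalog lists them.
shared = set(catalog_a) & set(catalog_b)
```

Note that the overlap is only detected here because the common data falls on
the same piece boundaries in both catalogs.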

	Third, trackers would be generalized to track pieces instead of
torrents.  Clients register their interest in a piece at the tracker, and
the tracker returns a bucket of peers who are interested in the same piece
(instead of peers interested in the same torrent).
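
	As a minimal sketch of such a tracker (in-memory only, with
hypothetical names PieceTracker, register and lookup, and an arbitrary
default bucket size), the per-piece bookkeeping might look like this:

```python
import random
from collections import defaultdict


class PieceTracker:
    """Hypothetical tracker that indexes peers by piece ID rather than by torrent."""

    def __init__(self):
        # piece ID -> set of peers that registered interest in that piece
        self._interested = defaultdict(set)

    def register(self, peer: str, piece_id: str) -> None:
        self._interested[piece_id].add(peer)

    def unregister(self, peer: str, piece_id: str) -> None:
        self._interested[piece_id].discard(peer)

    def lookup(self, peer: str, piece_id: str, bucket_size: int = 50) -> list[str]:
        """Return a random bucket of other peers interested in the same piece."""
        others = sorted(self._interested.get(piece_id, set()) - {peer})
        return random.sample(others, min(bucket_size, len(others)))


tracker = PieceTracker()
tracker.register("alice", "piece-1")
tracker.register("bob", "piece-1")
tracker.register("carol", "piece-2")
bucket = tracker.lookup("alice", "piece-1")  # bob only; carol wants another piece
```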

	Fourth, the piece list exchange needs some way to exchange all the
piece IDs that the client is interested in.  I propose Bloom Filters, and
one of the optimized set reconciliation protocols in the literature.  Thus,
when two peers meet, they calculate their shared interest.  This could even
be done using 'fuzzy' math to get an approximation of the shared interest.
If it is high enough, they could complete the full exchange.
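
	The Bloom filter half of this can be sketched as follows.  This is
an illustration only: the filter size, the number of hash functions, and the
names BloomFilter and estimated_shared_interest are assumptions, and no set
reconciliation protocol is implemented.  A peer sends its filter, and the
other side estimates what fraction of its own pieces the sender also wants:

```python
import hashlib


class BloomFilter:
    """Small Bloom filter over piece IDs; a Python int serves as the bit array."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-1 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def estimated_shared_interest(my_pieces, their_filter: BloomFilter) -> float:
    """Fraction of my pieces that appear to be in the other peer's filter.

    False positives can only push the estimate up, never down.
    """
    if not my_pieces:
        return 0.0
    hits = sum(1 for piece in my_pieces if piece in their_filter)
    return hits / len(my_pieces)


mine = [f"piece-{n}" for n in range(10)]        # piece-0 .. piece-9
theirs = BloomFilter()
for n in range(5, 15):                          # piece-5 .. piece-14
    theirs.add(f"piece-{n}")

overlap = estimated_shared_interest(mine, theirs)  # at least 0.5
```

If the estimate clears some threshold, the two peers would go on to the full
exchange; otherwise they can drop the connection cheaply.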

	Content Hashes lead naturally into.. DHTs.  So trackers could build
a P2P network (Chord, Pastry, etc..) where the keys are piece IDs and the
value returned by the DHT is a random list of other peers looking for that
piece.  Client software bootstraps by selecting a sample of pieces from all
the torrents you are interested in, performing lookups to gather a list of
potential peers, then filtering for the peers with similar interests.
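
	The bootstrap step could be sketched like this.  The DHT is stood
in for by a plain function (dht_lookup), and the names bootstrap_peers,
sample_size and keep are hypothetical.  Ranking peers by how many sampled
pieces they showed up for is one simple way to filter for similar interests,
not the only one:

```python
import random
from collections import Counter


def bootstrap_peers(wanted_pieces, dht_lookup, sample_size=8, keep=4, rng=random):
    """Sample some wanted pieces, look each up, and keep the most-seen peers.

    dht_lookup(piece_id) is assumed to return a list of peers looking for
    that piece; a real DHT (Chord, Pastry, ...) would sit behind it.
    """
    wanted = sorted(wanted_pieces)
    sample = rng.sample(wanted, min(sample_size, len(wanted)))

    seen = Counter()
    for piece_id in sample:
        for peer in dht_lookup(piece_id):
            seen[peer] += 1

    # Peers returned for many sampled pieces likely share more of our interests.
    return [peer for peer, _count in seen.most_common(keep)]


# Toy stand-in for the DHT: piece ID -> peers looking for that piece.
fake_dht = {
    "p1": ["alice", "bob"],
    "p2": ["alice"],
    "p3": ["alice", "carol"],
}
best = bootstrap_peers(
    {"p1", "p2", "p3"},
    lambda piece_id: fake_dht.get(piece_id, []),
    sample_size=3,
    keep=1,
)
```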

	As your interests change, or as clients come & go, you can
re-register your interest in pieces, or lookup another bucket of peers from
the DHT to trade with.

	'Seed' peers register their interest in all of the pieces they are
seeding, and are naturally found by the peers who perform lookups for those
pieces.

	Finally, the peers could also be Trackers simply by joining the DHT
and taking some of the load of handling lookup requests.  This follows the
eXeem model of distributing trackers across all the peers.

I hope that came out legibly..
bryan.turner at pobox.com
