[p2p-hackers] Node counting algorithm

Gwenchlan gwenchlan at fr.fm
Thu Feb 24 08:37:35 UTC 2005


Le mercredi 23 février 2005 à 21:32 -0800, Sean C. Rhea a écrit :

>On Feb 23, 2005, at 10:27 AM, Michael Parker wrote:
>> If your leaf set overlaps, it's just the number of entries in your 
>> leaf set.
>> If your leaf set does not overlap, divide the size of the ring (e.g. 
>> 2^128, 2^160) by the span of your leaf set (i.e., the farthest 
>> clockwise node minus the farthest counterclockwise node, modulo the 
>> ring size), and multiply by the size of your leaf set. Basically, what 
>> this means is if your leaf set is size L, and it spans a percentage x 
>> of the node identifier space, the size of the network is approximately 
>> L * x^-1. To improve accuracy, ask the two farthest nodes in your leaf 
>> set and ask them for their leaf sets, merging them into yours before 
>> calculating. That way, you have a larger effective L.
>
>This technique gives estimates that, on average, overestimate the size 
>of the network if you pick node identifiers uniformly at random (UAR).  
>The reason is that UAR doesn't mean evenly distributed; some nodes' 
>leaf sets cover much more than others.  If you have one node whose leaf 
>set covers a larger portion of the key space, that node underestimates 
>the size of the ring, but a lot of other nodes end up covering less of 
>the key space (to make room for the larger one) and end up 
>overestimating the network size.  When you average them all, the few 
>nodes that underestimate don't make up for all the rest that 
>overestimate it.
>
>IANAM (I am not a mathematician), but this is what happens when you 
>simulate it at least, with identifiers drawn using SHA.  References for 
>more mathematical explanations and a better algorithm are described in 
>this paper:
>
>   http://iptps05.cs.cornell.edu/PDFs/CameraReady_174.pdf
>
>Sean
>
Thanks for these DHT heuristics. I have omited to specify i was looking 
for tricks applying to unstructured networks (probably random graphs).
Thus we cannot deal with density here. I was thinking about using a 
flexible method like randow walk, but K random walkers launched by 
initiator raise the problem of scalability and accuracy as the overlay 
may (more or less) vary during the counting itself, phenomenon amplified 
by network size...




More information about the P2p-hackers mailing list