[p2p-hackers] zipf's law

Hal Finney hal at finney.org
Fri Jun 17 06:53:09 UTC 2005


John Casey writes:
> Thanks Hal I have just tried the binning technique but it doesn't seem
> to give very good results. On sorting the random numbers the
> distribution still seems to be very skewed which I am guessing is due
> to the integer nature of the binning process.
>
> http://www3.it.deakin.edu.au/~jacasey/bin-transform.gif
>
> I might go back to my older idea of using cern's zipf random generator
> save all the variables and then normalize them all by the largest
> number generated into the range [0,1]
>
> I have attached inline the java code that I used to generate that
> graph. Are there any obvious or not so obvious errors ??

Your program looks correct, although it's a little inefficient in that
it recomputes harmonic for each bin.  I ran it and counted how many times
it output each bin number, and the result looked like a good Zipf
distribution.  Here are the first 20 bins:

1963 1
 922 2
 611 3
 478 4
 386 5
 309 6
 288 7
 229 8
 214 9
 197 10
 185 11
 156 12
 155 13
 134 14
 149 15
 113 16
 112 17
 113 18
  89 19
  89 20

The first number is the number of times we got that output and the second
is the bin number.  As you can see, 2 happens about half as often as 1,
3 happens about 1/3 as often, 10 happens about 1/10 as often, and so on.
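
For anyone who wants to repeat this kind of check, here is a rough sketch
of the idea in Java.  It is not John's program: the class name ZipfBins,
the bin count of 100, the 10000 draws and the seed are all assumptions
picked for illustration.  It builds the cumulative harmonic weights once
(rather than calling harmonic per bin), draws bin numbers by binary
search on that table, and prints counts in the same "count bin" layout
as above.

```java
import java.util.Arrays;
import java.util.Random;

public class ZipfBins {
    // Cumulative probabilities for bins 1..n, with P(k) proportional to 1/k.
    // The harmonic sum is accumulated once instead of per bin.
    static double[] cumulative(int n) {
        double[] cdf = new double[n];
        double sum = 0.0;
        for (int k = 1; k <= n; k++) {
            sum += 1.0 / k;
            cdf[k - 1] = sum;
        }
        for (int i = 0; i < n; i++) {
            cdf[i] /= sum;
        }
        cdf[n - 1] = 1.0; // guard against floating-point rounding
        return cdf;
    }

    // Draw one bin number in 1..n by inverting the cumulative table.
    static int sample(double[] cdf, Random rng) {
        double u = rng.nextDouble(); // uniform in [0, 1)
        int idx = Arrays.binarySearch(cdf, u);
        if (idx >= 0) {
            idx = idx + 1;   // u hit a boundary exactly: it belongs to the next bin
        } else {
            idx = -idx - 1;  // insertion point: first entry greater than u
        }
        return idx + 1;      // convert 0-based index to 1-based bin number
    }

    // Count how often each bin number comes up; index 0 is unused.
    static int[] tally(double[] cdf, int draws, Random rng) {
        int[] counts = new int[cdf.length + 1];
        for (int i = 0; i < draws; i++) {
            counts[sample(cdf, rng)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] cdf = cumulative(100);           // assumed number of bins
        int[] counts = tally(cdf, 10000, new Random(42)); // assumed draws and seed
        for (int k = 1; k <= 20; k++) {
            System.out.printf("%4d %d%n", counts[k], k);
        }
    }
}
```

If the generator is behaving, count times bin number should stay roughly
constant down the list, which is the same 1/2, 1/3, 1/10 pattern described
above.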

I'm not sure what your graph was supposed to show but maybe you had a
mistake in your plotting program.  I'd say this output looks great.

Hal


