[p2p-hackers] zipf's law
Hal Finney
hal at finney.org
Fri Jun 17 06:53:09 UTC 2005
John Casey writes:
> Thanks Hal I have just tried the binning technique but it doesn't seem
> to give very good results. On sorting the random numbers the
> distribution still seems to be very skewed which I am guessing is due
> to the integer nature of the binning process.
>
> http://www3.it.deakin.edu.au/~jacasey/bin-transform.gif
>
> I might go back to my older idea of using cern's zipf random generator
> save all the variables and then normalize them all by the largest
> number generated into the range [0,1]
>
> I have attached inline the java code that I used to generate that
> graph. Are there any obvious or not so obvious errors ??
Your program looks correct, although it's a little inefficient in that
it calls harmonic again for each bin. I ran it and counted how many times
it produced each bin number, and the counts look like a good Zipf distribution.
Here are the first 20 bins:
1963 1
922 2
611 3
478 4
386 5
309 6
288 7
229 8
214 9
197 10
185 11
156 12
155 13
134 14
149 15
113 16
112 17
113 18
89 19
89 20
The first number is the number of times we got that output and the second
is the bin number. As you can see, 2 happens about half as often as 1
(922/1963 is about 0.47), 3 happens about 1/3 as often (611/1963 is about
0.31), 10 happens about 1/10 as often (197/1963 is about 0.10), and so on.
I'm not sure what your graph was supposed to show but maybe you had a
mistake in your plotting program. I'd say this output looks great.
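On the efficiency remark above about calling harmonic for each bin: one
way to avoid it is to build the cumulative sums of 1/k once and then look
up each uniform random draw with a binary search. What follows is only a
minimal sketch of that idea, not John's program. The class name ZipfBins,
the methods binFor and countBins, the bin count of 100, the 10000 draws and
the seed are all assumptions made up for illustration. It prints counts in
the same "count bin" layout as the table above, so the 1/k falloff can be
checked by eye.

```java
import java.util.Random;

// Hypothetical sketch: Zipf (exponent 1) sampling over bins 1..numBins,
// with the cumulative harmonic sums computed once instead of per bin.
class ZipfBins {
    // cdf[k-1] = P(bin <= k) = H_k / H_numBins
    private final double[] cdf;

    ZipfBins(int numBins) {
        cdf = new double[numBins];
        double sum = 0.0;
        for (int k = 1; k <= numBins; k++) {
            sum += 1.0 / k;          // running harmonic number H_k
            cdf[k - 1] = sum;
        }
        for (int k = 0; k < numBins; k++) {
            cdf[k] /= sum;           // normalize by H_numBins
        }
        cdf[numBins - 1] = 1.0;      // guard against rounding error
    }

    // Map a uniform draw u in [0,1) to the smallest bin whose cdf exceeds u.
    int binFor(double u) {
        int lo = 0, hi = cdf.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cdf[mid] > u) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo + 1;               // bins are numbered from 1
    }

    // Draw 'draws' samples and count how often each bin comes up.
    // counts[0] is unused so that counts[k] belongs to bin k.
    static int[] countBins(int numBins, int draws, long seed) {
        ZipfBins zipf = new ZipfBins(numBins);
        Random rng = new Random(seed);
        int[] counts = new int[numBins + 1];
        for (int i = 0; i < draws; i++) {
            counts[zipf.binFor(rng.nextDouble())]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumed parameters: 100 bins, 10000 draws, arbitrary seed.
        int[] counts = countBins(100, 10000, 12345L);
        for (int k = 1; k <= 20; k++) {
            System.out.println(counts[k] + " " + k);
        }
    }
}
```

If the counts for bin 2 come out near half of bin 1, and bin 10 near a
tenth, the generator is behaving as expected and any remaining skew is
more likely to come from how the values are sorted or plotted.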
Hal