I used a slightly different version of the code that uses Find instead of Select to set the values of the Frequency field. Starting this with a noshow sped things up and prevented all the blinking screen activity.
find «Record ID»=randominteger(1,10)
The results of several runs were fairly consistent and showed a gradual decline in frequency from record 1 (the highest) down to record 10 (the lowest), with 1 being returned almost three times as often as 10.
These tests may point to a problem with Select and Find, rather than with randominteger(.
This is the procedure I ran.
let x = rep(cr(),999999)
x = arrayfilter(x,cr(),"randominteger(1,10)")
importtext x, "ExistingData","Replace"
I ran it with sample sizes of 10,000, 100,000, and 1,000,000 respectively, and got these results.
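For anyone who wants to reproduce this kind of test outside Panorama, here is a rough Python equivalent of the tally (the function and variable names are my own invention): it draws a large batch of random integers from 1 to 10 and counts how often each value appears.

```python
import random
from collections import Counter

def tally(n, low=1, high=10):
    """Draw n random integers in [low, high] and count how often each appears."""
    return Counter(random.randint(low, high) for _ in range(n))

counts = tally(1_000_000)
# A uniform generator should put each bucket near n/10 = 100,000.
for value in sorted(counts):
    print(value, counts[value])
```

Running it with different values of n shows the relative spread shrinking as the sample grows, mirroring the 10,000 / 100,000 / 1,000,000 runs above.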
It takes longer to Find or Select a record at the bottom than one at the top. The statement that increments the frequency may not be waiting for the Find or Select to finish, and may be incrementing the first record while it’s still active.
This issue might be the same as Jim Cook’s in disguise.
Hmm, I don’t think so. One would expect measurements of natural phenomena to exhibit a bell-shaped curve. But for randomly generated numbers, over large test sizes, the results obtained by Dave are reasonable.
In the examples provided by Gary, one can see a definite pattern, which, by definition, is not random.
I can imagine one too. In fact, I once created one with a binomial distribution. If randominteger( were intended to do something fancy like that, I’m sure it would have been documented, rather than expecting us to use our imaginations and guess right.
The procedure I wrote, which doesn’t do any finding or selecting, is producing results that I believe are consistent with a uniform probability distribution.
Yes - with enough samples, as your test showed, the distribution levels out. I once tried the martingale system on a roulette wheel that gave 8 reds in a row (I had bet on black); the number of trials matters.
But a random bell curve distribution has its place. You might even say it’s normal.
The original example shows code that uses pre-existing data and then counts it in a suspect way, but it’s not really testing the randominteger( function directly, unless we’re sure the data in the OP’s database is actually random. It could in fact be data that falls under the scope of Benford’s law, and that’s why the data looks skewed.
Dave’s method seems to be a better test of the function.
Benford’s law applies to numbers that span several orders of magnitude. The examples that Paul Overby and Gary Yonaites gave had only the numbers 1 through 10, so Benford’s law doesn’t really apply.
It’s not supposed to be random. It’s simply 10 records containing the numbers 1 through 10. What they were doing was equivalent to rolling a 10-sided die 10,000 times and counting the number of times each number appeared. Their procedures would Select/Find the record containing the random number generated by randominteger(, and then add 1 to the number in that record’s Frequency field.
My method was designed to be faster, which allowed me to use larger sample sizes in a reasonable amount of time, but mathematically, the concept was no different from theirs.
I think the skewed results they got were due to a bug in Panorama; it just wasn’t the randominteger( function that caused them. My results weren’t skewed, because randominteger( was about the only thing my code had in common with theirs.
There is something you don’t understand about the select statement.
Let’s look at this line:
select «Record ID»=randominteger(1,10)
I think you are assuming that a random integer is calculated, then all records that match that integer are selected. But that is NOT what is happening.
What actually happens is that Panorama starts with the first record, calculates a random integer, then checks to see if the first record matches that random integer. Then Panorama goes to the second record, calculates a different random integer, and checks to see if the second record matches this new value. Another random integer is calculated for the third record, another for the fourth record, etc. So it’s very likely that multiple records will be selected. If that happens, only the first of the multiple records will have its frequency bumped, so records closer to the top of the database get bumped more. Exactly what you saw.
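The scan-and-reroll behavior described above is easy to simulate. This Python sketch (my own construction, not Panorama code) draws a fresh random integer for each record, top to bottom, and bumps only the first record that matches; the counts fall off from record 1 to record 10, just as reported.

```python
import random

def first_match(num_records=10):
    """Mimic select «Record ID»=randominteger(1,10): each record is tested
    against its own freshly drawn random integer; the first match wins."""
    for record_id in range(1, num_records + 1):
        if random.randint(1, num_records) == record_id:
            return record_id
    return None  # no record matched on this pass

freq = {i: 0 for i in range(1, 11)}
misses = 0
for _ in range(100_000):
    hit = first_match()
    if hit is None:
        misses += 1
    else:
        freq[hit] += 1

# Record k wins with probability (9/10)**(k-1) * (1/10), so record 1
# (~10%) is bumped roughly 2.6 times as often as record 10 (~3.9%).
print(freq, misses)
```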
Here is a fixed version of this code. With this code the random number is calculated once per iteration of the loop, not 100,000 times.
let r = randominteger(1,10)
select «Record ID»=r
Note that I also removed the selectall statement above the select statement. This wasn’t accomplishing anything other than wasting time.
Good point. That’s why the first record was so far out of whack, while the others gradually decreased. I think the chance of the first record being incremented was 35% (the odds that no record matched its random number at all, leaving the first record active) plus 10% (the odds that the first record matched its own random number), which is almost exactly the result he got (4483 out of 10,000).
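Assuming a failed select leaves the first record active, the arithmetic can be checked directly (plain Python, my framing of the numbers above):

```python
# Chance that none of the 10 records matches its own random draw:
p_no_match = 0.9 ** 10        # (9/10)^10 ≈ 0.3487, the "35%"
# Chance that the first record matches its own draw:
p_first_match = 1 / 10        # the "10%"
p_first_bumped = p_no_match + p_first_match
print(round(p_first_bumped, 4))  # prints 0.4487, i.e. ~4487 bumps in 10,000 trials
```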